Variable-grain and dynamic work generation for Minimal Unique Itemset mining

SUDA2 is a recursive search algorithm for minimal unique itemset detection. Such sets of items are formed via combinations of non-obvious attributes enabling individual record identification. The nature of SUDA2 allows work to be divided into non-overlapping tasks enabling parallel execution. Earlie...

Bibliographic details
Main authors:	Yiapanis, P. ; Haglin, D.J. ; Manning, A.M. ; Mayes, K. ; Keane, J.
Format:	Conference proceeding
Language:	eng
Subjects:	Computer science ; Correlation ; Gain ; Itemsets ; Load modeling ; Particle separators ; Program processors

container_end_page	41
container_issue
container_start_page	33
container_title
container_volume
creator	Yiapanis, P. ; Haglin, D.J. ; Manning, A.M. ; Mayes, K. ; Keane, J.
description	SUDA2 is a recursive search algorithm for minimal unique itemset detection. Such sets of items are formed via combinations of non-obvious attributes enabling individual record identification. The nature of SUDA2 allows work to be divided into non-overlapping tasks enabling parallel execution. Earlier work developed a parallel implementation for SUDA2 on an SMP cluster, and this was found to be several orders of magnitude faster than sequential SUDA2. However, if fixed-granularity parallel tasks are scheduled naively in the order of their generation, the system load tends to be imbalanced with little work at the beginning and end of the search. This paper investigates the effectiveness of variable-grained and dynamic work generation strategies for parallel SUDA2. These methods restrict the number of sub-tasks to be generated, based on the criterion of probable work size. The further we descend in the search recursion tree, the smaller the tasks become, thus we only select the largest tasks at each level of recursion as being suitable for scheduling. The revised algorithm runs approximately twice as fast as the existing parallel SUDA2 for finer levels of granularity when variable-grained work generation is applied. The dynamic method, performing level-wise task selection based on size, outperforms the other techniques investigated.
doi_str_mv	10.1109/CLUSTR.2008.4663753
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_4663753</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4663753</ieee_id><sourcerecordid>4663753</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-efa7e4af33065aaf68edb5d1dc0b504f0c10fdd2baa13b8ca821e7901813f3a33</originalsourceid><addsrcrecordid>eNotkFtLAzEUhOMNrLW_oC_5A1tzctlNHqV4KawI2vpazm5OSrSbanZF_Pcu2HkZ-AaGYRibg1gACHezrDev65eFFMIudFmqyqgTNnOVBS21lqUW7pRNJJS2cNKoM3Z1DJSDczYBY2RhRnDJZn3_LkZpo4wTE1a_YY7Y7KnYZYyJY_Lc_ybsYst_DvmD7yhRxiEeEg-HzJ9iih3u-SbFr2_iq4G6ngbejTjtrtlFwH1Ps6NP2eb-br18LOrnh9Xyti4iVGYoKGBFGoNSojSIobTkG-PBt6IxQgfRggjeywYRVGNbtBKocgIsqKBQqSmb__dGItp-5nFR_t0ej1F_hj5UIQ</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Variable-grain and dynamic work generation for Minimal Unique Itemset mining</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Yiapanis, P. ; Haglin, D.J. ; Manning, A.M. ; Mayes, K. ; Keane, J.</creator><creatorcontrib>Yiapanis, P. ; Haglin, D.J. ; Manning, A.M. ; Mayes, K. ; Keane, J.</creatorcontrib><description>SUDA2 is a recursive search algorithm for minimal unique itemset detection. Such sets of items are formed via combinations of non-obvious attributes enabling individual record identification. The nature of SUDA2 allows work to be divided into non-overlapping tasks enabling parallel execution. Earlier work developed a parallel implementation for SUDA2 on an SMP cluster, and this was found to be several orders of magnitude faster than sequential SUDA2. However, if fixed-granularity parallel tasks are scheduled naively in the order of their generation, the system load tends to be imbalanced with little work at the beginning and end of the search. 
This paper investigates the effectiveness of variable-grained and dynamic work generation strategies for parallel SUDA2. These methods restrict the number of sub-tasks to be generated, based on the criterion of probable work size. The further we descend in the search recursion tree, the smaller the tasks become, thus we only select the largest tasks at each level of recursion as being suitable for scheduling. The revised algorithm runs approximately twice as fast as the existing parallel SUDA2 for finer levels of granularity when variable-grained work generation is applied. The dynamic method, performing level-wise task selection based on size, outperforms the other techniques investigated.</description><identifier>ISSN: 1552-5244</identifier><identifier>ISBN: 1424426391</identifier><identifier>ISBN: 9781424426393</identifier><identifier>EISSN: 2168-9253</identifier><identifier>EISBN: 9781424426409</identifier><identifier>EISBN: 1424426405</identifier><identifier>DOI: 10.1109/CLUSTR.2008.4663753</identifier><language>eng</language><publisher>IEEE</publisher><subject>Computer science ; Correlation ; Gain ; Itemsets ; Load modeling ; Particle separators ; Program processors</subject><ispartof>2008 IEEE International Conference on Cluster Computing, 2008, p.33-41</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4663753$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4663753$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Yiapanis, P.</creatorcontrib><creatorcontrib>Haglin, D.J.</creatorcontrib><creatorcontrib>Manning, A.M.</creatorcontrib><creatorcontrib>Mayes, 
K.</creatorcontrib><creatorcontrib>Keane, J.</creatorcontrib><title>Variable-grain and dynamic work generation for Minimal Unique Itemset mining</title><title>2008 IEEE International Conference on Cluster Computing</title><addtitle>CLUSTR</addtitle><description>SUDA2 is a recursive search algorithm for minimal unique itemset detection. Such sets of items are formed via combinations of non-obvious attributes enabling individual record identification. The nature of SUDA2 allows work to be divided into non-overlapping tasks enabling parallel execution. Earlier work developed a parallel implementation for SUDA2 on an SMP cluster, and this was found to be several orders of magnitude faster than sequential SUDA2. However, if fixed-granularity parallel tasks are scheduled naively in the order of their generation, the system load tends to be imbalanced with little work at the beginning and end of the search. This paper investigates the effectiveness of variable-grained and dynamic work generation strategies for parallel SUDA2. These methods restrict the number of sub-tasks to be generated, based on the criterion of probable work size. The further we descend in the search recursion tree, the smaller the tasks become, thus we only select the largest tasks at each level of recursion as being suitable for scheduling. The revised algorithm runs approximately twice as fast as the existing parallel SUDA2 for finer levels of granularity when variable-grained work generation is applied. 
The dynamic method, performing level-wise task selection based on size, outperforms the other techniques investigated.</description><subject>Computer science</subject><subject>Correlation</subject><subject>Gain</subject><subject>Itemsets</subject><subject>Load modeling</subject><subject>Particle separators</subject><subject>Program processors</subject><issn>1552-5244</issn><issn>2168-9253</issn><isbn>1424426391</isbn><isbn>9781424426393</isbn><isbn>9781424426409</isbn><isbn>1424426405</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2008</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotkFtLAzEUhOMNrLW_oC_5A1tzctlNHqV4KawI2vpazm5OSrSbanZF_Pcu2HkZ-AaGYRibg1gACHezrDev65eFFMIudFmqyqgTNnOVBS21lqUW7pRNJJS2cNKoM3Z1DJSDczYBY2RhRnDJZn3_LkZpo4wTE1a_YY7Y7KnYZYyJY_Lc_ybsYst_DvmD7yhRxiEeEg-HzJ9iih3u-SbFr2_iq4G6ngbejTjtrtlFwH1Ps6NP2eb-br18LOrnh9Xyti4iVGYoKGBFGoNSojSIobTkG-PBt6IxQgfRggjeywYRVGNbtBKocgIsqKBQqSmb__dGItp-5nFR_t0ej1F_hj5UIQ</recordid><startdate>200809</startdate><enddate>200809</enddate><creator>Yiapanis, P.</creator><creator>Haglin, D.J.</creator><creator>Manning, A.M.</creator><creator>Mayes, K.</creator><creator>Keane, J.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>200809</creationdate><title>Variable-grain and dynamic work generation for Minimal Unique Itemset mining</title><author>Yiapanis, P. ; Haglin, D.J. ; Manning, A.M. ; Mayes, K. 
; Keane, J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-efa7e4af33065aaf68edb5d1dc0b504f0c10fdd2baa13b8ca821e7901813f3a33</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Computer science</topic><topic>Correlation</topic><topic>Gain</topic><topic>Itemsets</topic><topic>Load modeling</topic><topic>Particle separators</topic><topic>Program processors</topic><toplevel>online_resources</toplevel><creatorcontrib>Yiapanis, P.</creatorcontrib><creatorcontrib>Haglin, D.J.</creatorcontrib><creatorcontrib>Manning, A.M.</creatorcontrib><creatorcontrib>Mayes, K.</creatorcontrib><creatorcontrib>Keane, J.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yiapanis, P.</au><au>Haglin, D.J.</au><au>Manning, A.M.</au><au>Mayes, K.</au><au>Keane, J.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Variable-grain and dynamic work generation for Minimal Unique Itemset mining</atitle><btitle>2008 IEEE International Conference on Cluster Computing</btitle><stitle>CLUSTR</stitle><date>2008-09</date><risdate>2008</risdate><spage>33</spage><epage>41</epage><pages>33-41</pages><issn>1552-5244</issn><eissn>2168-9253</eissn><isbn>1424426391</isbn><isbn>9781424426393</isbn><eisbn>9781424426409</eisbn><eisbn>1424426405</eisbn><abstract>SUDA2 is a recursive search algorithm for minimal unique itemset detection. 
Such sets of items are formed via combinations of non-obvious attributes enabling individual record identification. The nature of SUDA2 allows work to be divided into non-overlapping tasks enabling parallel execution. Earlier work developed a parallel implementation for SUDA2 on an SMP cluster, and this was found to be several orders of magnitude faster than sequential SUDA2. However, if fixed-granularity parallel tasks are scheduled naively in the order of their generation, the system load tends to be imbalanced with little work at the beginning and end of the search. This paper investigates the effectiveness of variable-grained and dynamic work generation strategies for parallel SUDA2. These methods restrict the number of sub-tasks to be generated, based on the criterion of probable work size. The further we descend in the search recursion tree, the smaller the tasks become, thus we only select the largest tasks at each level of recursion as being suitable for scheduling. The revised algorithm runs approximately twice as fast as the existing parallel SUDA2 for finer levels of granularity when variable-grained work generation is applied. The dynamic method, performing level-wise task selection based on size, outperforms the other techniques investigated.</abstract><pub>IEEE</pub><doi>10.1109/CLUSTR.2008.4663753</doi><tpages>9</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1552-5244
ispartof	2008 IEEE International Conference on Cluster Computing, 2008, p.33-41
issn	1552-5244 2168-9253
language	eng
recordid	cdi_ieee_primary_4663753
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Computer science ; Correlation ; Gain ; Itemsets ; Load modeling ; Particle separators ; Program processors
title	Variable-grain and dynamic work generation for Minimal Unique Itemset mining
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T01%3A40%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Variable-grain%20and%20dynamic%20work%20generation%20for%20Minimal%20Unique%20Itemset%20mining&rft.btitle=2008%20IEEE%20International%20Conference%20on%20Cluster%20Computing&rft.au=Yiapanis,%20P.&rft.date=2008-09&rft.spage=33&rft.epage=41&rft.pages=33-41&rft.issn=1552-5244&rft.eissn=2168-9253&rft.isbn=1424426391&rft.isbn_list=9781424426393&rft_id=info:doi/10.1109/CLUSTR.2008.4663753&rft_dat=%3Cieee_6IE%3E4663753%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781424426409&rft.eisbn_list=1424426405&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4663753&rfr_iscdi=true
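
The description field above says that the number of generated sub-tasks is restricted by probable work size, with only the largest tasks at each recursion level selected. The sketch below is one possible reading of that idea, not the authors' implementation: at each level the largest sub-trees are split into their children, and every other sub-tree is emitted as a single unsplit task. The `Node` class, the `size` estimate, the `generate_tasks` function, and the `split_per_level` and `max_depth` parameters are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical search-tree node (not taken from the paper)."""
    name: str
    size: int  # assumed estimate of the probable work in this sub-tree
    children: List["Node"] = field(default_factory=list)


def generate_tasks(root: Node, split_per_level: int, max_depth: int) -> List[str]:
    """Return names of non-overlapping tasks that together cover the tree.

    On each level, the `split_per_level` largest sub-trees are split into
    their children; all remaining sub-trees become one task each.
    """
    tasks: List[str] = []
    frontier = [root]
    for _ in range(max_depth):
        # Rank the sub-trees on this level by estimated size, largest first.
        ranked = sorted(frontier, key=lambda n: n.size, reverse=True)
        next_frontier: List[Node] = []
        for rank, node in enumerate(ranked):
            if rank < split_per_level and node.children:
                # Large enough to be worth splitting further.
                next_frontier.extend(node.children)
            else:
                # Too small (or a leaf): schedule the whole sub-tree as one task.
                tasks.append(node.name)
        frontier = next_frontier
        if not frontier:
            break
    # Sub-trees still pending at the depth limit are scheduled unsplit.
    tasks.extend(node.name for node in frontier)
    return tasks


def example_tree() -> Node:
    """Small made-up tree whose sizes shrink with depth."""
    a1 = Node("a1", 40, [Node("a1x", 25), Node("a1y", 15)])
    a = Node("a", 60, [a1, Node("a2", 20)])
    b = Node("b", 30, [Node("b1", 20), Node("b2", 10)])
    return Node("root", 100, [a, b, Node("c", 10)])


if __name__ == "__main__":
    print(generate_tasks(example_tree(), split_per_level=1, max_depth=3))
```

Raising `split_per_level` or `max_depth` yields more, finer tasks; setting `split_per_level` to 0 leaves the whole search as one task. How work size is actually estimated in parallel SUDA2 is not stated in this record, so `size` is simply supplied by hand here.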