TRICE: Mining Frequent Itemsets by Iterative TRimmed Transaction LattICE in Sparse Big Data
Sparseness is often witnessed in big data emanating from a variety of sources, including IoT, pervasive computing, and behavioral data. Frequent itemset mining is the first and foremost step of association rule mining, which is a distinguished unsupervised machine learning problem. However, techniqu...
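Published in: | IEEE access 2019, Vol.7, p.181688-181705 |
---|---|
Main authors: | Yasir, Muhammad ; Habib, Muhammad Asif ; Ashraf, Muhammad ; Sarwar, Shahzad ; Chaudhry, Muhammad Umar ; Shahwani, Hamayoun ; Ahmad, Mudassar ; Muhammad Nadeem Faisal, Ch |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |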
container_end_page | 181705 |
---|---|
container_issue | |
container_start_page | 181688 |
container_title | IEEE access |
container_volume | 7 |
creator | Yasir, Muhammad ; Habib, Muhammad Asif ; Ashraf, Muhammad ; Sarwar, Shahzad ; Chaudhry, Muhammad Umar ; Shahwani, Hamayoun ; Ahmad, Mudassar ; Muhammad Nadeem Faisal, Ch |
description | Sparseness is often witnessed in big data emanating from a variety of sources, including IoT, pervasive computing, and behavioral data. Frequent itemset mining is the first and foremost step of association rule mining, which is a distinguished unsupervised machine learning problem. However, techniques for frequent itemset mining are little explored for sparse real-world data, where they show somewhat comparable performance. In contrast, the methods are adequately validated for dense data and stand apart from each other in terms of performance. Hence, there is a pressing need to evaluate these techniques and to propose new ones for large sparse real-world datasets. In this study, a novel method, Mining Frequent Itemsets by Iterative TRimmed Transaction lattICE (TRICE), is proposed. TRICE iteratively generates combinations of varying-sized trimmed subsets of I, where I denotes the set of distinct items in a database. Extensive experiments are conducted to assess TRICE against the HARPP, FP-Growth, optimized SaM, and optimized RElim algorithms. The experimental results show that TRICE outperforms all these algorithms in terms of both running time and memory consumption. TRICE maintains a substantial performance gap for all sparse real-world datasets on all minimum support thresholds. Moreover, the memory use of the optimized SaM and RElim algorithms has been assessed for the first time. |
doi_str_mv | 10.1109/ACCESS.2019.2959878 |
format | Article |
fullrecord | Publisher: Piscataway: IEEE ; CODEN: IAECCG ; EISSN: 2169-3536 ; IEEE document ID: 8933017 ; DOAJ ID: oai_doaj_org_article_f92c0bc7a7cc42adb5ac14925afcf09f ; Total pages: 18 ; Peer reviewed ; Open access (free to read) ; Rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019 ; ORCID: https://orcid.org/0000-0002-6366-8230, https://orcid.org/0000-0003-3074-9162, https://orcid.org/0000-0003-2211-8360, https://orcid.org/0000-0001-8781-4143 ; HTML full text: https://ieeexplore.ieee.org/document/8933017 |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2019, Vol.7, p.181688-181705 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2455596367 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals |
subjects | Algorithms ; Association rules ; Big Data ; big data applications ; Data mining ; Datasets ; frequent itemset mining ; Information technology ; Itemsets ; Iterative methods ; Lattices ; Machine learning ; Memory management ; pattern recognition ; pervasive computing ; Run time (computers) ; Ubiquitous computing |
title | TRICE: Mining Frequent Itemsets by Iterative TRimmed Transaction LattICE in Sparse Big Data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T18%3A36%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TRICE:%20Mining%20Frequent%20Itemsets%20by%20Iterative%20TRimmed%20Transaction%20LattICE%20in%20Sparse%20Big%20Data&rft.jtitle=IEEE%20access&rft.au=Yasir,%20Muhammad&rft.date=2019&rft.volume=7&rft.spage=181688&rft.epage=181705&rft.pages=181688-181705&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2019.2959878&rft_dat=%3Cproquest_doaj_%3E2455596367%3C/proquest_doaj_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2455596367&rft_id=info:pmid/&rft_ieee_id=8933017&rft_doaj_id=oai_doaj_org_article_f92c0bc7a7cc42adb5ac14925afcf09f&rfr_iscdi=true |
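
The description above states the frequent itemset mining task only in outline. As a rough illustration of that task, the following Python sketch counts itemsets that meet a minimum support threshold after first trimming infrequent items out of every transaction. It is not the TRICE algorithm, whose details this record does not give; the function name `frequent_itemsets`, the `max_size` parameter, and the sample transactions are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset tuple: support count} for itemsets seen in at least
    min_support transactions. Brute-force illustration, NOT the TRICE method."""
    # Count in how many transactions each single item occurs.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Trim every transaction down to its frequent items. This is safe because
    # an itemset containing an infrequent item cannot itself be frequent.
    trimmed = [sorted(set(t) & frequent_items) for t in transactions]

    result = {}
    for size in range(1, max_size + 1):
        counts = Counter()
        for t in trimmed:
            # Sorted transactions yield each combination in one canonical order.
            counts.update(combinations(t, size))
        level = {s: c for s, c in counts.items() if c >= min_support}
        if not level:
            break  # no frequent itemsets of this size, so none of larger size
        result.update(level)
    return result


# Hypothetical market-basket data.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]
itemsets = frequent_itemsets(transactions, min_support=3)
for itemset, support in sorted(itemsets.items()):
    print(itemset, support)
```

Trimming matters most on sparse data, where most items are rare and each trimmed transaction contributes far fewer combinations than the original one would.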