TRICE: Mining Frequent Itemsets by Iterative TRimmed Transaction LattICE in Sparse Big Data

Sparseness is often witnessed in big data emanating from a variety of sources, including IoT, pervasive computing, and behavioral data. Frequent itemset mining is the first and foremost step of association rule mining, which is a distinguished unsupervised machine learning problem. However, techniqu...

Bibliographic Details
Published in: IEEE access 2019, Vol.7, p.181688-181705
Main Authors: Yasir, Muhammad; Habib, Muhammad Asif; Ashraf, Muhammad; Sarwar, Shahzad; Chaudhry, Muhammad Umar; Shahwani, Hamayoun; Ahmad, Mudassar; Muhammad Nadeem Faisal, Ch
Format: Article
Language: eng
Subjects:
Online Access: Full text
container_end_page 181705
container_issue
container_start_page 181688
container_title IEEE access
container_volume 7
creator Yasir, Muhammad
Habib, Muhammad Asif
Ashraf, Muhammad
Sarwar, Shahzad
Chaudhry, Muhammad Umar
Shahwani, Hamayoun
Ahmad, Mudassar
Muhammad Nadeem Faisal, Ch
description Sparseness is often witnessed in big data emanating from a variety of sources, including IoT, pervasive computing, and behavioral data. Frequent itemset mining is the first and foremost step of association rule mining, which is a distinguished unsupervised machine learning problem. However, techniques for frequent itemset mining are the least explored for sparse real-world data, where they show somewhat comparable performance. In contrast, the methods are adequately validated for dense data and stand apart from each other in terms of performance. Hence, there is a pressing need to evaluate these techniques and to propose new ones for large sparse real-world datasets. In this study, a novel method, Mining Frequent Itemsets by Iterative TRimmed Transaction LattICE (TRICE), is proposed. TRICE iteratively generates combinations of varying-sized trimmed subsets of I, where I denotes the set of distinct items in a database. Extensive experiments are conducted to assess TRICE against the HARPP, FP-Growth, optimized SaM, and optimized RElim algorithms. The experimental results show that TRICE outperforms all of these algorithms in terms of both running time and memory consumption. TRICE maintains a substantial performance gap for all sparse real-world datasets at all minimum support thresholds. Moreover, the memory use of the optimized SaM and RElim algorithms has been assessed for the first time.
doi_str_mv 10.1109/ACCESS.2019.2959878
format Article
fullrecord <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_proquest_journals_2455596367</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8933017</ieee_id><doaj_id>oai_doaj_org_article_f92c0bc7a7cc42adb5ac14925afcf09f</doaj_id><sourcerecordid>2455596367</sourcerecordid><originalsourceid>FETCH-LOGICAL-c408t-a3c4d1402ea4afd8927072575f8cbd06b7cd03db8487b53ebedf9af4977641363</originalsourceid><addsrcrecordid>eNpNUU1rGzEQXUIKDW5-QS6CnO3ocyX1lm6c1OAQiN1TD2JWH0Ym3nUkOeB_n3XXhM5lhsd7b4Z5VXVD8IwQrO_um2a-Ws0oJnpGtdBKqovqipJaT5lg9eV_8_fqOuctHkoNkJBX1d_166KZ_0TPsYvdBj0m_37wXUGL4nfZl4za42lOUOKHR-vXuNt5h9YJugy2xL5DSyhlsECxQ6s9pOzRr7hBD1DgR_UtwFv21-c-qf48ztfN7-ny5WnR3C-nlmNVpsAsd4Rj6oFDcEpTiSUVUgRlW4frVlqHmWsVV7IVzLfeBQ2BaylrTljNJtVi9HU9bM0-xR2ko-khmn9AnzYGUon2zZugqcWtlSCt5RRcK8ASrqmAYAPWYfC6Hb32qR8-kYvZ9ofUDecbyoUQuma1HFhsZNnU55x8-NpKsDmFYsZQzCkUcw5lUN2Mqui9_1IozRgmkn0CpiyHgA</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2455596367</pqid></control><display><type>article</type><title>TRICE: Mining Frequent Itemsets by Iterative TRimmed Transaction LattICE in Sparse Big Data</title><source>IEEE Open Access Journals</source><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>Yasir, Muhammad ; Habib, Muhammad Asif ; Ashraf, Muhammad ; Sarwar, Shahzad ; Chaudhry, Muhammad Umar ; Shahwani, Hamayoun ; Ahmad, Mudassar ; Muhammad Nadeem Faisal, Ch</creator><creatorcontrib>Yasir, Muhammad ; Habib, Muhammad Asif ; Ashraf, Muhammad ; Sarwar, Shahzad ; Chaudhry, Muhammad Umar ; Shahwani, Hamayoun ; Ahmad, Mudassar ; Muhammad Nadeem Faisal, Ch</creatorcontrib><description>Sparseness is often witnessed in big data emanating from a variety of sources, including IoT, pervasive computing, and behavioral data. 
Frequent itemset mining is the first and foremost step of association rule mining, which is a distinguished unsupervised machine learning problem. However, techniques for frequent itemset mining are least explored for sparse real-world data, showing somewhat comparable performance. On the contrary, the methods are adequately validated for dense data and stand apart from each other in terms of performance. Hence, there arises an immense need for evaluating these techniques as well as proposing new ones for large sparse real-world datasets. In this study, a novel method: Mining Frequent Itemsets by Iterative TRimmed Transaction lattICE (TRICE) is proposed. TRICE iteratively generates combinations of varying-sized trimmed subsets of I, where I denote the set of distinct items in a database. Extensive experiments are conducted to assess TRICE against HARPP, FP-Growth, optimized SaM, and optimized RElim algorithms. The experimental results show that TRICE outperforms all these algorithms both in terms of running time and memory consumption. TRICE maintains a substantial performance gap for all sparse real-world datasets on all minimum support thresholds. 
Moreover, assessment of memory use of optimized SaM and RElim algorithms has been performed for the first time.</description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2019.2959878</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Algorithms ; Association rules ; Big Data ; big data applications ; Data mining ; Datasets ; frequent itemset mining ; Information technology ; Itemsets ; Iterative methods ; Lattices ; Machine learning ; Memory management ; pattern recognition ; pervasive computing ; Run time (computers) ; Ubiquitous computing</subject><ispartof>IEEE access, 2019, Vol.7, p.181688-181705</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c408t-a3c4d1402ea4afd8927072575f8cbd06b7cd03db8487b53ebedf9af4977641363</citedby><cites>FETCH-LOGICAL-c408t-a3c4d1402ea4afd8927072575f8cbd06b7cd03db8487b53ebedf9af4977641363</cites><orcidid>0000-0002-6366-8230 ; 0000-0003-3074-9162 ; 0000-0003-2211-8360 ; 0000-0001-8781-4143</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8933017$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,860,2096,4010,27610,27900,27901,27902,54908</link.rule.ids></links><search><creatorcontrib>Yasir, Muhammad</creatorcontrib><creatorcontrib>Habib, Muhammad Asif</creatorcontrib><creatorcontrib>Ashraf, Muhammad</creatorcontrib><creatorcontrib>Sarwar, Shahzad</creatorcontrib><creatorcontrib>Chaudhry, Muhammad Umar</creatorcontrib><creatorcontrib>Shahwani, Hamayoun</creatorcontrib><creatorcontrib>Ahmad, 
Mudassar</creatorcontrib><creatorcontrib>Muhammad Nadeem Faisal, Ch</creatorcontrib><title>TRICE: Mining Frequent Itemsets by Iterative TRimmed Transaction LattICE in Sparse Big Data</title><title>IEEE access</title><addtitle>Access</addtitle><description>Sparseness is often witnessed in big data emanating from a variety of sources, including IoT, pervasive computing, and behavioral data. Frequent itemset mining is the first and foremost step of association rule mining, which is a distinguished unsupervised machine learning problem. However, techniques for frequent itemset mining are least explored for sparse real-world data, showing somewhat comparable performance. On the contrary, the methods are adequately validated for dense data and stand apart from each other in terms of performance. Hence, there arises an immense need for evaluating these techniques as well as proposing new ones for large sparse real-world datasets. In this study, a novel method: Mining Frequent Itemsets by Iterative TRimmed Transaction lattICE (TRICE) is proposed. TRICE iteratively generates combinations of varying-sized trimmed subsets of I, where I denote the set of distinct items in a database. Extensive experiments are conducted to assess TRICE against HARPP, FP-Growth, optimized SaM, and optimized RElim algorithms. The experimental results show that TRICE outperforms all these algorithms both in terms of running time and memory consumption. TRICE maintains a substantial performance gap for all sparse real-world datasets on all minimum support thresholds. 
Moreover, assessment of memory use of optimized SaM and RElim algorithms has been performed for the first time.</description><subject>Algorithms</subject><subject>Association rules</subject><subject>Big Data</subject><subject>big data applications</subject><subject>Data mining</subject><subject>Datasets</subject><subject>frequent itemset mining</subject><subject>Information technology</subject><subject>Itemsets</subject><subject>Iterative methods</subject><subject>Lattices</subject><subject>Machine learning</subject><subject>Memory management</subject><subject>pattern recognition</subject><subject>pervasive computing</subject><subject>Run time (computers)</subject><subject>Ubiquitous computing</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUU1rGzEQXUIKDW5-QS6CnO3ocyX1lm6c1OAQiN1TD2JWH0Ym3nUkOeB_n3XXhM5lhsd7b4Z5VXVD8IwQrO_um2a-Ws0oJnpGtdBKqovqipJaT5lg9eV_8_fqOuctHkoNkJBX1d_166KZ_0TPsYvdBj0m_37wXUGL4nfZl4za42lOUOKHR-vXuNt5h9YJugy2xL5DSyhlsECxQ6s9pOzRr7hBD1DgR_UtwFv21-c-qf48ztfN7-ny5WnR3C-nlmNVpsAsd4Rj6oFDcEpTiSUVUgRlW4frVlqHmWsVV7IVzLfeBQ2BaylrTljNJtVi9HU9bM0-xR2ko-khmn9AnzYGUon2zZugqcWtlSCt5RRcK8ASrqmAYAPWYfC6Hb32qR8-kYvZ9ofUDecbyoUQuma1HFhsZNnU55x8-NpKsDmFYsZQzCkUcw5lUN2Mqui9_1IozRgmkn0CpiyHgA</recordid><startdate>2019</startdate><enddate>2019</enddate><creator>Yasir, Muhammad</creator><creator>Habib, Muhammad Asif</creator><creator>Ashraf, Muhammad</creator><creator>Sarwar, Shahzad</creator><creator>Chaudhry, Muhammad Umar</creator><creator>Shahwani, Hamayoun</creator><creator>Ahmad, Mudassar</creator><creator>Muhammad Nadeem Faisal, Ch</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-6366-8230</orcidid><orcidid>https://orcid.org/0000-0003-3074-9162</orcidid><orcidid>https://orcid.org/0000-0003-2211-8360</orcidid><orcidid>https://orcid.org/0000-0001-8781-4143</orcidid></search><sort><creationdate>2019</creationdate><title>TRICE: Mining Frequent Itemsets by Iterative TRimmed Transaction LattICE in Sparse Big Data</title><author>Yasir, Muhammad ; Habib, Muhammad Asif ; Ashraf, Muhammad ; Sarwar, Shahzad ; Chaudhry, Muhammad Umar ; Shahwani, Hamayoun ; Ahmad, Mudassar ; Muhammad Nadeem Faisal, Ch</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c408t-a3c4d1402ea4afd8927072575f8cbd06b7cd03db8487b53ebedf9af4977641363</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Algorithms</topic><topic>Association rules</topic><topic>Big Data</topic><topic>big data applications</topic><topic>Data mining</topic><topic>Datasets</topic><topic>frequent itemset mining</topic><topic>Information technology</topic><topic>Itemsets</topic><topic>Iterative methods</topic><topic>Lattices</topic><topic>Machine learning</topic><topic>Memory management</topic><topic>pattern recognition</topic><topic>pervasive computing</topic><topic>Run time (computers)</topic><topic>Ubiquitous computing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yasir, Muhammad</creatorcontrib><creatorcontrib>Habib, Muhammad Asif</creatorcontrib><creatorcontrib>Ashraf, Muhammad</creatorcontrib><creatorcontrib>Sarwar, Shahzad</creatorcontrib><creatorcontrib>Chaudhry, Muhammad 
Umar</creatorcontrib><creatorcontrib>Shahwani, Hamayoun</creatorcontrib><creatorcontrib>Ahmad, Mudassar</creatorcontrib><creatorcontrib>Muhammad Nadeem Faisal, Ch</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yasir, Muhammad</au><au>Habib, Muhammad Asif</au><au>Ashraf, Muhammad</au><au>Sarwar, Shahzad</au><au>Chaudhry, Muhammad Umar</au><au>Shahwani, Hamayoun</au><au>Ahmad, Mudassar</au><au>Muhammad Nadeem Faisal, Ch</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>TRICE: Mining Frequent Itemsets by Iterative TRimmed Transaction LattICE in Sparse Big Data</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2019</date><risdate>2019</risdate><volume>7</volume><spage>181688</spage><epage>181705</epage><pages>181688-181705</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract>Sparseness is often witnessed in big data 
emanating from a variety of sources, including IoT, pervasive computing, and behavioral data. Frequent itemset mining is the first and foremost step of association rule mining, which is a distinguished unsupervised machine learning problem. However, techniques for frequent itemset mining are least explored for sparse real-world data, showing somewhat comparable performance. On the contrary, the methods are adequately validated for dense data and stand apart from each other in terms of performance. Hence, there arises an immense need for evaluating these techniques as well as proposing new ones for large sparse real-world datasets. In this study, a novel method: Mining Frequent Itemsets by Iterative TRimmed Transaction lattICE (TRICE) is proposed. TRICE iteratively generates combinations of varying-sized trimmed subsets of I, where I denote the set of distinct items in a database. Extensive experiments are conducted to assess TRICE against HARPP, FP-Growth, optimized SaM, and optimized RElim algorithms. The experimental results show that TRICE outperforms all these algorithms both in terms of running time and memory consumption. TRICE maintains a substantial performance gap for all sparse real-world datasets on all minimum support thresholds. Moreover, assessment of memory use of optimized SaM and RElim algorithms has been performed for the first time.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2019.2959878</doi><tpages>18</tpages><orcidid>https://orcid.org/0000-0002-6366-8230</orcidid><orcidid>https://orcid.org/0000-0003-3074-9162</orcidid><orcidid>https://orcid.org/0000-0003-2211-8360</orcidid><orcidid>https://orcid.org/0000-0001-8781-4143</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2019, Vol.7, p.181688-181705
issn 2169-3536
2169-3536
language eng
recordid cdi_proquest_journals_2455596367
source IEEE Open Access Journals; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals
subjects Algorithms
Association rules
Big Data
big data applications
Data mining
Datasets
frequent itemset mining
Information technology
Itemsets
Iterative methods
Lattices
Machine learning
Memory management
pattern recognition
pervasive computing
Run time (computers)
Ubiquitous computing
title TRICE: Mining Frequent Itemsets by Iterative TRimmed Transaction LattICE in Sparse Big Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T18%3A36%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TRICE:%20Mining%20Frequent%20Itemsets%20by%20Iterative%20TRimmed%20Transaction%20LattICE%20in%20Sparse%20Big%20Data&rft.jtitle=IEEE%20access&rft.au=Yasir,%20Muhammad&rft.date=2019&rft.volume=7&rft.spage=181688&rft.epage=181705&rft.pages=181688-181705&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2019.2959878&rft_dat=%3Cproquest_doaj_%3E2455596367%3C/proquest_doaj_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2455596367&rft_id=info:pmid/&rft_ieee_id=8933017&rft_doaj_id=oai_doaj_org_article_f92c0bc7a7cc42adb5ac14925afcf09f&rfr_iscdi=true