Discovering expressive process models by clustering log traces

Process mining techniques have recently received notable attention in the literature for their ability to assist in the (re)design of complex processes by automatically discovering models that explain the events registered in some log traces provided as input. Following this line of research, the p...

Bibliographic details
Published in: IEEE transactions on knowledge and data engineering 2006-08, Vol.18 (8), p.1010-1027
Main authors: Greco, G., Guzzo, A., Pontieri, L., Sacca, D.
Format: Article
Language: eng
container_end_page 1027
container_issue 8
container_start_page 1010
container_title IEEE transactions on knowledge and data engineering
container_volume 18
creator Greco, G.
Guzzo, A.
Pontieri, L.
Sacca, D.
description Process mining techniques have recently received notable attention in the literature for their ability to assist in the (re)design of complex processes by automatically discovering models that explain the events registered in log traces provided as input. Following this line of research, the paper investigates an extension of such basic approaches in which the identification of different variants of the process is explicitly accounted for, based on the clustering of log traces. Modeling each group of similar executions with a different schema makes it possible to single out "conformant" models, which minimize the number of modeled enactments that are extraneous to the process semantics. To this end, a novel process mining framework is introduced and some relevant computational issues are studied in depth. Because finding an exact solution to this enhanced process mining problem is proven to require high computational costs in most practical cases, a greedy approach is devised. It is founded on an iterative, hierarchical refinement of the process model where, at each step, traces sharing similar behavior patterns are clustered together and equipped with a specialized schema. The algorithm guarantees that each refinement leads to an increasingly sound model, thus attaining a monotonic search. Experimental results provide evidence of the validity of the approach with respect to both effectiveness and scalability.
doi_str_mv 10.1109/TKDE.2006.123
format Article
publisher New York, NY: IEEE
coden ITKEEH
eissn 1558-2191
date 2006-08-01
tpages 18
rights 2006 INIST-CNRS
Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2006
toplevel peer_reviewed
online_resources
ieee_id 1644726
linktohtml https://ieeexplore.ieee.org/document/1644726
fulltext fulltext_linktorsrc
identifier ISSN: 1041-4347
ispartof IEEE transactions on knowledge and data engineering, 2006-08, Vol.18 (8), p.1010-1027
issn 1041-4347
1558-2191
language eng
recordid cdi_proquest_miscellaneous_1671383891
source IEEE Electronic Library (IEL)
subjects Algorithms
Applied sciences
association rules
classification
Clustering
Clustering algorithms
Companies
Computational efficiency
Computer science; control theory; systems
Computer Society
Customer relationship management
Data mining
Data processing. List processing. Character string processing
Enterprise resource planning
Exact sciences and technology
Exact solutions
Information systems. Data bases
Iterative algorithms
Management information systems
Mathematical analysis
Mathematical models
Memory organisation. Data processing
Mining
Process mining
Semantics
Software
Studies
Supply chain management
workflow management
title Discovering expressive process models by clustering log traces
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T05%3A59%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Discovering%20expressive%20process%20models%20by%20clustering%20log%20traces&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Greco,%20G.&rft.date=2006-08-01&rft.volume=18&rft.issue=8&rft.spage=1010&rft.epage=1027&rft.pages=1010-1027&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2006.123&rft_dat=%3Cproquest_RIE%3E1671383891%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=912122482&rft_id=info:pmid/&rft_ieee_id=1644726&rfr_iscdi=true