Discovering expressive process models by clustering log traces
Process mining techniques have recently received notable attention in the literature for their ability to assist in the (re)design of complex processes by automatically discovering models that explain the events registered in some log traces provided as input. Following this line of research, the p...
Published in: | IEEE transactions on knowledge and data engineering 2006-08, Vol.18 (8), p.1010-1027 |
---|---|
Main authors: | Greco, G. ; Guzzo, A. ; Pontieri, L. ; Sacca, D. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 1027 |
---|---|
container_issue | 8 |
container_start_page | 1010 |
container_title | IEEE transactions on knowledge and data engineering |
container_volume | 18 |
creator | Greco, G. ; Guzzo, A. ; Pontieri, L. ; Sacca, D. |
description | Process mining techniques have recently received notable attention in the literature for their ability to assist in the (re)design of complex processes by automatically discovering models that explain the events registered in some log traces provided as input. Following this line of research, the paper investigates an extension of such basic approaches, where the identification of different variants of the process is explicitly accounted for, based on the clustering of log traces. Indeed, modeling each group of similar executions with a different schema allows us to single out "conformant" models, which, specifically, minimize the number of modeled enactments that are extraneous to the process semantics. Therefore, a novel process mining framework is introduced and some relevant computational issues are studied in depth. As finding an exact solution to such an enhanced process mining problem is proven to require high computational costs in most practical cases, a greedy approach is devised. This is founded on an iterative, hierarchical refinement of the process model, where, at each step, traces sharing similar behavior patterns are clustered together and equipped with a specialized schema. The algorithm guarantees that each refinement leads to an increasingly sound model, thus attaining a monotonic search. Experimental results demonstrate the validity of the approach with respect to both effectiveness and scalability. |
doi_str_mv | 10.1109/TKDE.2006.123 |
format | Article |
coden | ITKEEH |
date | 2006-08-01 |
eissn | 1558-2191 |
genre | article |
ieee_id | 1644726 |
linktohtml | https://ieeexplore.ieee.org/document/1644726 |
publisher | New York, NY: IEEE |
rights | 2006 INIST-CNRS ; Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2006 |
tpages | 18 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1041-4347 |
ispartof | IEEE transactions on knowledge and data engineering, 2006-08, Vol.18 (8), p.1010-1027 |
issn | 1041-4347 1558-2191 |
language | eng |
recordid | cdi_proquest_miscellaneous_1671383891 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms ; Applied sciences ; association rules ; classification ; Clustering ; Clustering algorithms ; Companies ; Computational efficiency ; Computer science; control theory; systems ; Computer Society ; Customer relationship management ; Data mining ; Data processing. List processing. Character string processing ; Enterprise resource planning ; Exact sciences and technology ; Exact solutions ; Information systems. Data bases ; Iterative algorithms ; Management information systems ; Mathematical analysis ; Mathematical models ; Memory organisation. Data processing ; Mining ; Process mining ; Semantics ; Software ; Studies ; Supply chain management ; workflow management |
title | Discovering expressive process models by clustering log traces |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T05%3A59%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Discovering%20expressive%20process%20models%20by%20clustering%20log%20traces&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Greco,%20G.&rft.date=2006-08-01&rft.volume=18&rft.issue=8&rft.spage=1010&rft.epage=1027&rft.pages=1010-1027&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2006.123&rft_dat=%3Cproquest_RIE%3E1671383891%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=912122482&rft_id=info:pmid/&rft_ieee_id=1644726&rfr_iscdi=true |
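
The greedy, hierarchical refinement summarized in the description field above can be illustrated with a small sketch. This is not the authors' algorithm: the record does not give their schema language, soundness measure, or clustering features, so everything below is an assumption made for illustration. A trace is taken to be a tuple of activity names, a cluster's "schema" is reduced to the set of directly-follows pairs of its traces, and a cluster is split on the single pair whose presence divides its traces most evenly. The function names `edges`, `schema`, `best_split_edge`, and `refine` are hypothetical.

```python
from collections import Counter


def edges(trace):
    """Directly-follows pairs of one trace (assumed stand-in for behavior patterns)."""
    return set(zip(trace, trace[1:]))


def schema(cluster):
    """Toy 'schema' of a cluster: union of the directly-follows pairs of its traces."""
    result = set()
    for trace in cluster:
        result |= edges(trace)
    return result


def best_split_edge(cluster):
    """Pair that occurs in some but not all traces, with frequency closest to 50%.

    Returns None when every trace has the same set of pairs (nothing to split on).
    """
    counts = Counter(e for trace in cluster for e in edges(trace))
    n = len(cluster)
    candidates = [(abs(c - n / 2), e) for e, c in counts.items() if 0 < c < n]
    return min(candidates)[1] if candidates else None


def refine(traces, max_clusters):
    """Greedy top-down refinement: repeatedly split the cluster with the largest schema."""
    clusters = [list(traces)]
    while len(clusters) < max_clusters:
        splittable = [
            (len(schema(c)), i)
            for i, c in enumerate(clusters)
            if best_split_edge(c) is not None
        ]
        if not splittable:
            break  # every cluster is already homogeneous
        _, i = max(splittable)
        cluster = clusters.pop(i)
        edge = best_split_edge(cluster)
        # Each part gets its own, more specialized schema.
        clusters.append([t for t in cluster if edge in edges(t)])
        clusters.append([t for t in cluster if edge not in edges(t)])
    return clusters


# Hypothetical log with three behavioral variants.
log = [
    ("a", "b", "c", "d"),
    ("a", "b", "c", "d"),
    ("a", "c", "b", "d"),
    ("a", "e", "d"),
]
for cluster in refine(log, max_clusters=10):
    print(sorted(schema(cluster)), len(cluster))
```

In this toy version, each split yields two clusters whose schemas are subsets of the parent's schema, which loosely mirrors the idea of specializing the model at every step. It makes no claim about the soundness guarantee or the monotonic search stated in the abstract.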