Discovering expressive process models by clustering log traces

Process mining techniques have recently received notable attention in the literature for their ability to assist in the (re)design of complex processes by automatically discovering models that explain the events registered in some log traces provided as input. Following this line of research, the p...

Bibliographic Details
Published in:	IEEE transactions on knowledge and data engineering, 2006-08, Vol.18 (8), p.1010-1027
Main authors:	Greco, G. ; Guzzo, A. ; Pontieri, L. ; Sacca, D.
Format:	Article
Language:	eng
Subjects:	Algorithms ; Applied sciences ; association rules ; classification ; Clustering ; Clustering algorithms ; Companies ; Computational efficiency ; Computer science; control theory; systems ; Computer Society ; Customer relationship management ; Data mining ; Data processing. List processing. Character string processing ; Enterprise resource planning ; Exact sciences and technology ; Exact solutions ; Information systems. Data bases ; Iterative algorithms ; Management information systems ; Mathematical analysis ; Mathematical models ; Memory organisation. Data processing ; Mining ; Process mining ; Semantics ; Software ; Studies ; Supply chain management ; workflow management

container_end_page	1027
container_issue	8
container_start_page	1010
container_title	IEEE transactions on knowledge and data engineering
container_volume	18
creator	Greco, G. ; Guzzo, A. ; Pontieri, L. ; Sacca, D.
description	Process mining techniques have recently received notable attention in the literature for their ability to assist in the (re)design of complex processes by automatically discovering models that explain the events registered in some log traces provided as input. Following this line of research, the paper investigates an extension of such basic approaches, where the identification of different variants for the process is explicitly accounted for, based on the clustering of log traces. Indeed, modeling each group of similar executions with a different schema allows us to single out "conformant" models, which, specifically, minimize the number of modeled enactments that are extraneous to the process semantics. Therefore, a novel process mining framework is introduced and some relevant computational issues are studied in depth. As finding an exact solution to such an enhanced process mining problem is proven to require high computational costs in most practical cases, a greedy approach is devised. This is founded on an iterative, hierarchical refinement of the process model, where, at each step, traces sharing similar behavior patterns are clustered together and equipped with a specialized schema. The algorithm guarantees that each refinement leads to an increasingly sound model, thus attaining a monotonic search. Experimental results demonstrate the validity of the approach with respect to both effectiveness and scalability.
doi_str_mv	10.1109/TKDE.2006.123
format	Article
publisher	New York, NY: IEEE
coden	ITKEEH
rights	2006 INIST-CNRS ; Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2006
tpages	18
ieee_id	1644726
linktohtml	https://ieeexplore.ieee.org/document/1644726
backlink	http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=17954427
fulltext	fulltext_linktorsrc
identifier	ISSN: 1041-4347
ispartof	IEEE transactions on knowledge and data engineering, 2006-08, Vol.18 (8), p.1010-1027
issn	1041-4347 1558-2191
language	eng
recordid	cdi_proquest_miscellaneous_1671383891
source	IEEE Electronic Library (IEL)
subjects	Algorithms ; Applied sciences ; association rules ; classification ; Clustering ; Clustering algorithms ; Companies ; Computational efficiency ; Computer science; control theory; systems ; Computer Society ; Customer relationship management ; Data mining ; Data processing. List processing. Character string processing ; Enterprise resource planning ; Exact sciences and technology ; Exact solutions ; Information systems. Data bases ; Iterative algorithms ; Management information systems ; Mathematical analysis ; Mathematical models ; Memory organisation. Data processing ; Mining ; Process mining ; Semantics ; Software ; Studies ; Supply chain management ; workflow management
title	Discovering expressive process models by clustering log traces
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T05%3A59%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Discovering%20expressive%20process%20models%20by%20clustering%20log%20traces&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Greco,%20G.&rft.date=2006-08-01&rft.volume=18&rft.issue=8&rft.spage=1010&rft.epage=1027&rft.pages=1010-1027&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2006.123&rft_dat=%3Cproquest_RIE%3E1671383891%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=912122482&rft_id=info:pmid/&rft_ieee_id=1644726&rfr_iscdi=true