Optimization framework for DFG-based automated process discovery approaches
The problem of automatically discovering business process models from event logs has been intensely investigated in the past two decades, leading to a wide range of approaches that strike various trade-offs between accuracy, model complexity, and execution time. A few studies have suggested that the...
Published in: | Software and systems modeling 2021-08, Vol.20 (4), p.1245-1270 |
---|---|
Main authors: | Augusto, Adriano ; Dumas, Marlon ; La Rosa, Marcello ; Leemans, Sander J. J. ; vanden Broucke, Seppe K. L. M. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1270 |
---|---|
container_issue | 4 |
container_start_page | 1245 |
container_title | Software and systems modeling |
container_volume | 20 |
creator | Augusto, Adriano ; Dumas, Marlon ; La Rosa, Marcello ; Leemans, Sander J. J. ; vanden Broucke, Seppe K. L. M. |
description | The problem of automatically discovering business process models from event logs has been intensely investigated in the past two decades, leading to a wide range of approaches that strike various trade-offs between accuracy, model complexity, and execution time. A few studies have suggested that the accuracy of automated process discovery approaches can be enhanced by means of metaheuristic optimization techniques. However, these studies have remained at the level of proposals without validation on real-life datasets or they have only considered one metaheuristic in isolation. This article presents a metaheuristic optimization framework for automated process discovery. The key idea of the framework is to construct a directly-follows graph (DFG) from the event log, to perturb this DFG so as to generate new candidate solutions, and to apply a DFG-based automated process discovery approach in order to derive a process model from each DFG. The framework can be instantiated by linking it to an automated process discovery approach, an optimization metaheuristic, and the quality measure to be optimized (e.g., fitness, precision, F-score). The article considers several instantiations of the framework corresponding to four optimization metaheuristics, three automated process discovery approaches (Inductive Miner—directly-follows, Fodina, and Split Miner), and one accuracy measure (Markovian F-score). These framework instances are compared using a set of 20 real-life event logs. The evaluation shows that metaheuristic optimization consistently yields visible improvements in F-score for all the three automated process discovery approaches, at the cost of execution times in the order of minutes, versus seconds for the baseline approaches. |
doi_str_mv | 10.1007/s10270-020-00846-x |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2567803409</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2567803409</sourcerecordid><originalsourceid>FETCH-LOGICAL-c363t-101b8a68c82ee90d4088934be5c9acb3fa75f1270be71443e1e3affab8101b673</originalsourceid><addsrcrecordid>eNp9UD1PwzAQtRBIVKV_gCkSc-Acu7YzokILolIXmC3HPUOA1MFOoOXX4xIEG8Ppnk7vQ_cIOaVwTgHkRaRQSMihSAOKi3x7QEZU0DKnTPLDXyzEMZnEWFcAvChLLsSI3K3arm7qT9PVfpO5YBr88OElcz5kV_NFXpmI68z0nW9Ml1AbvMUYs3UdrX_HsMtMm27GPmE8IUfOvEac_OwxeZhf389u8uVqcTu7XOaWCdblFGiljFBWFYglrDkoVTJe4dSWxlbMGTl1NH1UoaScM6TIjHOmUnulkGxMzgbfFPzWY-z0s-_DJkXqYiqkAsahTKxiYNngYwzodBvqxoSdpqD3vemhN51609-96W0SsUEUE3nziOHP-h_VF8CxcU4</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2567803409</pqid></control><display><type>article</type><title>Optimization framework for DFG-based automated process discovery approaches</title><source>SpringerNature Journals</source><creator>Augusto, Adriano ; Dumas, Marlon ; La Rosa, Marcello ; Leemans, Sander J. J. ; vanden Broucke, Seppe K. L. M.</creator><creatorcontrib>Augusto, Adriano ; Dumas, Marlon ; La Rosa, Marcello ; Leemans, Sander J. J. ; vanden Broucke, Seppe K. L. M.</creatorcontrib><description>The problem of automatically discovering business process models from event logs has been intensely investigated in the past two decades, leading to a wide range of approaches that strike various trade-offs between accuracy, model complexity, and execution time. A few studies have suggested that the accuracy of automated process discovery approaches can be enhanced by means of metaheuristic optimization techniques. However, these studies have remained at the level of proposals without validation on real-life datasets or they have only considered one metaheuristic in isolation. 
This article presents a metaheuristic optimization framework for automated process discovery. The key idea of the framework is to construct a directly-follows graph (DFG) from the event log, to perturb this DFG so as to generate new candidate solutions, and to apply a DFG-based automated process discovery approach in order to derive a process model from each DFG. The framework can be instantiated by linking it to an automated process discovery approach, an optimization metaheuristic, and the quality measure to be optimized (e.g., fitness, precision, F-score). The article considers several instantiations of the framework corresponding to four optimization metaheuristics, three automated process discovery approaches (Inductive Miner—directly-follows, Fodina, and Split Miner), and one accuracy measure (Markovian F-score). These framework instances are compared using a set of 20 real-life event logs. The evaluation shows that metaheuristic optimization consistently yields visible improvements in F-score for all the three automated process discovery approaches, at the cost of execution times in the order of minutes, versus seconds for the baseline approaches.</description><identifier>ISSN: 1619-1366</identifier><identifier>EISSN: 1619-1374</identifier><identifier>DOI: 10.1007/s10270-020-00846-x</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Accuracy ; Automation ; Compilers ; Computer Science ; Heuristic methods ; Information Systems Applications (incl.Internet) ; Interpreters ; IT in Business ; Model accuracy ; Optimization ; Optimization techniques ; Programming Languages ; Programming Techniques ; Regular Paper ; Software Engineering ; Software Engineering/Programming and Operating Systems</subject><ispartof>Software and systems modeling, 2021-08, Vol.20 (4), p.1245-1270</ispartof><rights>The Author(s) 2021</rights><rights>The Author(s) 2021. 
This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c363t-101b8a68c82ee90d4088934be5c9acb3fa75f1270be71443e1e3affab8101b673</citedby><cites>FETCH-LOGICAL-c363t-101b8a68c82ee90d4088934be5c9acb3fa75f1270be71443e1e3affab8101b673</cites><orcidid>0000-0001-7970-5246</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10270-020-00846-x$$EPDF$$P50$$Gspringer$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10270-020-00846-x$$EHTML$$P50$$Gspringer$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,27924,27925,41488,42557,51319</link.rule.ids></links><search><creatorcontrib>Augusto, Adriano</creatorcontrib><creatorcontrib>Dumas, Marlon</creatorcontrib><creatorcontrib>La Rosa, Marcello</creatorcontrib><creatorcontrib>Leemans, Sander J. J.</creatorcontrib><creatorcontrib>vanden Broucke, Seppe K. L. M.</creatorcontrib><title>Optimization framework for DFG-based automated process discovery approaches</title><title>Software and systems modeling</title><addtitle>Softw Syst Model</addtitle><description>The problem of automatically discovering business process models from event logs has been intensely investigated in the past two decades, leading to a wide range of approaches that strike various trade-offs between accuracy, model complexity, and execution time. A few studies have suggested that the accuracy of automated process discovery approaches can be enhanced by means of metaheuristic optimization techniques. 
However, these studies have remained at the level of proposals without validation on real-life datasets or they have only considered one metaheuristic in isolation. This article presents a metaheuristic optimization framework for automated process discovery. The key idea of the framework is to construct a directly-follows graph (DFG) from the event log, to perturb this DFG so as to generate new candidate solutions, and to apply a DFG-based automated process discovery approach in order to derive a process model from each DFG. The framework can be instantiated by linking it to an automated process discovery approach, an optimization metaheuristic, and the quality measure to be optimized (e.g., fitness, precision, F-score). The article considers several instantiations of the framework corresponding to four optimization metaheuristics, three automated process discovery approaches (Inductive Miner—directly-follows, Fodina, and Split Miner), and one accuracy measure (Markovian F-score). These framework instances are compared using a set of 20 real-life event logs. 
The evaluation shows that metaheuristic optimization consistently yields visible improvements in F-score for all the three automated process discovery approaches, at the cost of execution times in the order of minutes, versus seconds for the baseline approaches.</description><subject>Accuracy</subject><subject>Automation</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Heuristic methods</subject><subject>Information Systems Applications (incl.Internet)</subject><subject>Interpreters</subject><subject>IT in Business</subject><subject>Model accuracy</subject><subject>Optimization</subject><subject>Optimization techniques</subject><subject>Programming Languages</subject><subject>Programming Techniques</subject><subject>Regular Paper</subject><subject>Software Engineering</subject><subject>Software Engineering/Programming and Operating Systems</subject><issn>1619-1366</issn><issn>1619-1374</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNp9UD1PwzAQtRBIVKV_gCkSc-Acu7YzokILolIXmC3HPUOA1MFOoOXX4xIEG8Ppnk7vQ_cIOaVwTgHkRaRQSMihSAOKi3x7QEZU0DKnTPLDXyzEMZnEWFcAvChLLsSI3K3arm7qT9PVfpO5YBr88OElcz5kV_NFXpmI68z0nW9Ml1AbvMUYs3UdrX_HsMtMm27GPmE8IUfOvEac_OwxeZhf389u8uVqcTu7XOaWCdblFGiljFBWFYglrDkoVTJe4dSWxlbMGTl1NH1UoaScM6TIjHOmUnulkGxMzgbfFPzWY-z0s-_DJkXqYiqkAsahTKxiYNngYwzodBvqxoSdpqD3vemhN51609-96W0SsUEUE3nziOHP-h_VF8CxcU4</recordid><startdate>20210801</startdate><enddate>20210801</enddate><creator>Augusto, Adriano</creator><creator>Dumas, Marlon</creator><creator>La Rosa, Marcello</creator><creator>Leemans, Sander J. J.</creator><creator>vanden Broucke, Seppe K. L. 
M.</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>C6C</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7XB</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0001-7970-5246</orcidid></search><sort><creationdate>20210801</creationdate><title>Optimization framework for DFG-based automated process discovery approaches</title><author>Augusto, Adriano ; Dumas, Marlon ; La Rosa, Marcello ; Leemans, Sander J. J. ; vanden Broucke, Seppe K. L. 
M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c363t-101b8a68c82ee90d4088934be5c9acb3fa75f1270be71443e1e3affab8101b673</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Accuracy</topic><topic>Automation</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Heuristic methods</topic><topic>Information Systems Applications (incl.Internet)</topic><topic>Interpreters</topic><topic>IT in Business</topic><topic>Model accuracy</topic><topic>Optimization</topic><topic>Optimization techniques</topic><topic>Programming Languages</topic><topic>Programming Techniques</topic><topic>Regular Paper</topic><topic>Software Engineering</topic><topic>Software Engineering/Programming and Operating Systems</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Augusto, Adriano</creatorcontrib><creatorcontrib>Dumas, Marlon</creatorcontrib><creatorcontrib>La Rosa, Marcello</creatorcontrib><creatorcontrib>Leemans, Sander J. J.</creatorcontrib><creatorcontrib>vanden Broucke, Seppe K. L. 
M.</creatorcontrib><collection>Springer Nature OA Free Journals</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><jtitle>Software and systems 
modeling</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Augusto, Adriano</au><au>Dumas, Marlon</au><au>La Rosa, Marcello</au><au>Leemans, Sander J. J.</au><au>vanden Broucke, Seppe K. L. M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Optimization framework for DFG-based automated process discovery approaches</atitle><jtitle>Software and systems modeling</jtitle><stitle>Softw Syst Model</stitle><date>2021-08-01</date><risdate>2021</risdate><volume>20</volume><issue>4</issue><spage>1245</spage><epage>1270</epage><pages>1245-1270</pages><issn>1619-1366</issn><eissn>1619-1374</eissn><abstract>The problem of automatically discovering business process models from event logs has been intensely investigated in the past two decades, leading to a wide range of approaches that strike various trade-offs between accuracy, model complexity, and execution time. A few studies have suggested that the accuracy of automated process discovery approaches can be enhanced by means of metaheuristic optimization techniques. However, these studies have remained at the level of proposals without validation on real-life datasets or they have only considered one metaheuristic in isolation. This article presents a metaheuristic optimization framework for automated process discovery. The key idea of the framework is to construct a directly-follows graph (DFG) from the event log, to perturb this DFG so as to generate new candidate solutions, and to apply a DFG-based automated process discovery approach in order to derive a process model from each DFG. The framework can be instantiated by linking it to an automated process discovery approach, an optimization metaheuristic, and the quality measure to be optimized (e.g., fitness, precision, F-score). 
The article considers several instantiations of the framework corresponding to four optimization metaheuristics, three automated process discovery approaches (Inductive Miner—directly-follows, Fodina, and Split Miner), and one accuracy measure (Markovian F-score). These framework instances are compared using a set of 20 real-life event logs. The evaluation shows that metaheuristic optimization consistently yields visible improvements in F-score for all the three automated process discovery approaches, at the cost of execution times in the order of minutes, versus seconds for the baseline approaches.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s10270-020-00846-x</doi><tpages>26</tpages><orcidid>https://orcid.org/0000-0001-7970-5246</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1619-1366 |
ispartof | Software and systems modeling, 2021-08, Vol.20 (4), p.1245-1270 |
issn | 1619-1366 ; 1619-1374 |
language | eng |
recordid | cdi_proquest_journals_2567803409 |
source | SpringerNature Journals |
subjects | Accuracy ; Automation ; Compilers ; Computer Science ; Heuristic methods ; Information Systems Applications (incl.Internet) ; Interpreters ; IT in Business ; Model accuracy ; Optimization ; Optimization techniques ; Programming Languages ; Programming Techniques ; Regular Paper ; Software Engineering ; Software Engineering/Programming and Operating Systems |
title | Optimization framework for DFG-based automated process discovery approaches |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T09%3A45%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimization%20framework%20for%20DFG-based%20automated%20process%20discovery%20approaches&rft.jtitle=Software%20and%20systems%20modeling&rft.au=Augusto,%20Adriano&rft.date=2021-08-01&rft.volume=20&rft.issue=4&rft.spage=1245&rft.epage=1270&rft.pages=1245-1270&rft.issn=1619-1366&rft.eissn=1619-1374&rft_id=info:doi/10.1007/s10270-020-00846-x&rft_dat=%3Cproquest_cross%3E2567803409%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2567803409&rft_id=info:pmid/&rfr_iscdi=true |
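
The loop outlined in the description (build a directly-follows graph from the event log, perturb it, derive a model from each candidate, keep the best-scoring one) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the edge-removal perturbation, and the plain hill-climbing search are all assumptions chosen for brevity, and the `discover` and `score` callables are hypothetical stand-ins for a DFG-based discovery approach (such as Split Miner) and an accuracy measure (such as the Markovian F-score). The only formula used is the standard F-score, the harmonic mean of fitness and precision.

```python
import random
from collections import Counter


def build_dfg(log):
    """Count how often activity a is directly followed by activity b.

    `log` is a list of traces; each trace is a list of activity labels.
    """
    dfg = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


def perturb(dfg, rng):
    """Return a copy of the DFG with one randomly chosen edge removed.

    Assumption: edge removal is only one possible move, picked for illustration.
    The last remaining edge is never removed.
    """
    candidate = Counter(dfg)
    if len(candidate) > 1:
        edge = rng.choice(sorted(candidate))
        del candidate[edge]
    return candidate


def f_score(fitness, precision):
    """Harmonic mean of fitness and precision (0.0 when both are zero)."""
    if fitness + precision == 0:
        return 0.0
    return 2 * fitness * precision / (fitness + precision)


def optimize(log, discover, score, iterations=100, seed=0):
    """Hill climbing over perturbed DFGs (a stand-in for a real metaheuristic).

    discover(dfg) -> model      hypothetical DFG-based discovery approach
    score(model, log) -> float  hypothetical quality measure, higher is better
    """
    rng = random.Random(seed)
    best_dfg = build_dfg(log)
    best_model = discover(best_dfg)
    best_score = score(best_model, log)
    for _ in range(iterations):
        candidate_dfg = perturb(best_dfg, rng)
        model = discover(candidate_dfg)
        candidate_score = score(model, log)
        # Accept only strict improvements; the search may stop in a local optimum.
        if candidate_score > best_score:
            best_dfg, best_model, best_score = candidate_dfg, model, candidate_score
    return best_model, best_score
```

A call such as `optimize(log, discover=my_miner, score=my_measure)` would return the best model found and its score, where `my_miner` and `my_measure` are placeholders the caller must supply. Strict hill climbing is used only because it is the shortest search to write down; it can get stuck where two or more edges must change together, which is one motivation for the more elaborate metaheuristics the article evaluates.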