To aggregate or to eliminate? Optimal model simplification for improved process performance prediction
Published in: Information Systems (Oxford), 2018-11, Vol. 78, pp. 96-111
Main authors: , , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Summary:
• A technique for performance-driven model reduction of GSPNs is proposed.
• The technique relies on foldings that aggregate or eliminate performance information.
• Foldings preserve model stability and come with a bound on the performance estimation error they introduce.
• Given a budget for the estimation error, an optimal sequence of foldings can be found.

Operational process models such as generalised stochastic Petri nets (GSPNs) are useful when answering performance questions about business processes (e.g. ‘how long will it take for a case to finish?’). Recently, methods for process mining have been developed to discover and enrich operational models based on a log of recorded executions of processes, which enables evidence-based process analysis. To avoid a bias due to infrequent execution paths, discovery algorithms strive for a balance between over-fitting and under-fitting with respect to the originating log. However, state-of-the-art discovery algorithms address this balance solely for the control-flow dimension, neglecting the impact of their design choices in terms of performance measures. In this work, we thus offer a technique for controlled performance-driven model reduction of GSPNs, using structural simplification rules, namely foldings. We propose a set of foldings that aggregate or eliminate performance information. We further prove the soundness of these foldings in terms of stability preservation and provide bounds on the error that they introduce with respect to the original model. Furthermore, we show how to find an optimal sequence of simplification rules, such that their application yields a minimal model under a given error budget for performance estimation. We evaluate the approach with two real-world datasets from the healthcare and telecommunication domains, showing that model simplification indeed enables a controlled reduction of model size, while preserving performance metrics with respect to the original model. Moreover, we show that aggregation dominates elimination when abstracting performance models, as it prevents under-fitting due to information loss.
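For intuition only, the following Python sketch illustrates the kind of optimisation the abstract describes: given candidate foldings, each with a size reduction and an error bound, pick the combination that shrinks the model the most without exceeding a given error budget. It is a deliberately simplified stand-in, not the paper's algorithm: it treats foldings as independent with additive error bounds, and all folding names, size gains, and error values are made up.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple

# Toy illustration of the "error budget" idea: each candidate folding is
# assumed to remove a fixed number of nodes and to add a fixed, independent
# error bound. These assumptions (and all names/numbers) are hypothetical
# and do not reproduce the paper's GSPN foldings or optimisation.

@dataclass(frozen=True)
class Folding:
    name: str        # hypothetical label for which rule is applied where
    size_gain: int   # number of model nodes removed by this folding
    error: float     # bound on the performance-estimation error it introduces

def best_folding_set(candidates: List[Folding],
                     budget: float) -> Tuple[List[Folding], int, float]:
    """Brute-force search for the subset of foldings that removes the most
    nodes while the summed error bounds stay within the given budget."""
    best_set: List[Folding] = []
    best_gain, best_err = 0, 0.0
    for r in range(len(candidates) + 1):
        for subset in combinations(candidates, r):
            err = sum(f.error for f in subset)
            gain = sum(f.size_gain for f in subset)
            if err <= budget and gain > best_gain:
                best_set, best_gain, best_err = list(subset), gain, err
    return best_set, best_gain, best_err

if __name__ == "__main__":
    candidates = [
        Folding("aggregate sequential delays t1,t2", size_gain=1, error=0.0),
        Folding("aggregate parallel branches t3,t4", size_gain=2, error=0.5),
        Folding("eliminate rare path after t5",      size_gain=3, error=2.0),
    ]
    chosen, gain, err = best_folding_set(candidates, budget=1.0)
    print(f"removed {gain} nodes, error bound {err}:", [f.name for f in chosen])
```

In this toy instance the low-error aggregations are selected while the costly elimination is rejected under the budget, loosely mirroring the paper's observation that aggregation tends to dominate elimination because it loses less performance information.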
ISSN: 0306-4379, 1873-6076
DOI: 10.1016/j.is.2018.04.003