To aggregate or to eliminate? Optimal model simplification for improved process performance prediction
Published in: Information Systems (Oxford), 2018-11, Vol. 78, pp. 96-111
Main authors: , , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Summary:
• A technique for performance-driven model reduction of GSPNs is proposed.
• The technique relies on foldings that aggregate or eliminate performance information.
• Foldings preserve model stability and come with a bound on the performance estimation error they introduce.
• Given a budget for the estimation error, an optimal sequence of foldings can be found.

Operational process models such as generalised stochastic Petri nets (GSPNs) are useful when answering performance questions about business processes (e.g. ‘how long will it take for a case to finish?’). Recently, methods for process mining have been developed to discover and enrich operational models based on a log of recorded executions of processes, which enables evidence-based process analysis. To avoid a bias due to infrequent execution paths, discovery algorithms strive for a balance between over-fitting and under-fitting with respect to the originating log. However, state-of-the-art discovery algorithms address this balance solely for the control-flow dimension, neglecting the impact of their design choices in terms of performance measures. In this work, we thus offer a technique for controlled performance-driven model reduction of GSPNs, using structural simplification rules, namely foldings. We propose a set of foldings that aggregate or eliminate performance information. We further prove the soundness of these foldings in terms of stability preservation and provide bounds on the error that they introduce with respect to the original model. Furthermore, we show how to find an optimal sequence of simplification rules, such that their application yields a minimal model under a given error budget for performance estimation. We evaluate the approach with two real-world datasets from the healthcare and telecommunication domains, showing that model simplification indeed enables a controlled reduction of model size, while preserving performance metrics with respect to the original model. Moreover, we show that aggregation dominates elimination when abstracting performance models, as it prevents under-fitting due to information loss.
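For intuition only, the following Python sketch illustrates the kind of optimisation the abstract describes: given candidate foldings, each with a size reduction and an error bound, pick the combination that shrinks the model the most without exceeding a given error budget. It is a deliberately simplified stand-in, not the paper's algorithm: it treats foldings as independent with additive error bounds, and all folding names, size gains, and error values are made up.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple

# Toy illustration of the "error budget" idea: each candidate folding is
# assumed to remove a fixed number of nodes and to add a fixed, independent
# error bound. These assumptions (and all names/numbers) are hypothetical
# and do not reproduce the paper's GSPN foldings or optimisation.

@dataclass(frozen=True)
class Folding:
    name: str        # hypothetical label for which rule is applied where
    size_gain: int   # number of model nodes removed by this folding
    error: float     # bound on the performance-estimation error it introduces

def best_folding_set(candidates: List[Folding],
                     budget: float) -> Tuple[List[Folding], int, float]:
    """Brute-force search for the subset of foldings that removes the most
    nodes while the summed error bounds stay within the given budget."""
    best_set: List[Folding] = []
    best_gain, best_err = 0, 0.0
    for r in range(len(candidates) + 1):
        for subset in combinations(candidates, r):
            err = sum(f.error for f in subset)
            gain = sum(f.size_gain for f in subset)
            if err <= budget and gain > best_gain:
                best_set, best_gain, best_err = list(subset), gain, err
    return best_set, best_gain, best_err

if __name__ == "__main__":
    candidates = [
        Folding("aggregate sequential delays t1,t2", size_gain=1, error=0.0),
        Folding("aggregate parallel branches t3,t4", size_gain=2, error=0.5),
        Folding("eliminate rare path after t5",      size_gain=3, error=2.0),
    ]
    chosen, gain, err = best_folding_set(candidates, budget=1.0)
    print(f"removed {gain} nodes, error bound {err}:", [f.name for f in chosen])
```

In this toy instance the low-error aggregations are selected while the costly elimination is rejected under the budget, loosely mirroring the paper's observation that aggregation tends to dominate elimination because it loses less performance information.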
ISSN: 0306-4379, 1873-6076
DOI: 10.1016/j.is.2018.04.003