Estimating Global Completeness of Event Logs: A Comparative Study

Bibliographic Details
Published in: IEEE Transactions on Services Computing, 2021-03, Vol. 14 (2), pp. 441-457
Main authors: Pei, Jisheng; Wen, Lijie; Yang, Hedong; Wang, Jianmin; Ye, Xiaojun
Format: Article
Language: English
Description
Summary: Event logs are the basis of process mining techniques and tools that extract process behavior information for better understanding and optimization of business processes. While it is widely recognized that the degree of completeness of an event log may largely determine the effectiveness of these techniques, how to estimate that completeness has not yet been fully addressed, mainly because ground-truth process models are usually unknown. To attack this problem, we take a closer look at several concepts and implicit assumptions in the log completeness estimation problem and characterize it as a special case of the species estimation problem in statistics. Although species estimation is still an open problem, a number of statistical models and techniques with approximate solutions are available. To investigate the relevance of these methods for event log completeness estimation, we designed and conducted a wide-ranging empirical study with quantitative experiments on both real-world and synthesized event logs to compare their performance. In addition, completeness estimates for several important and widely used real-world event logs are reported for the first time, together with some best-practice experience gained through this research.
ISSN: 1939-1374, 2372-0204
DOI: 10.1109/TSC.2018.2805912
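
The summary frames log completeness as a species estimation problem. A minimal sketch of that idea follows. It assumes that each distinct trace variant counts as one "species" and uses the bias-corrected Chao1 estimator, a well-known species richness estimator. This record does not say which estimators the article compares or recommends, so Chao1 is only an illustrative choice here, and the function name `chao1_completeness` is hypothetical.

```python
from collections import Counter


def chao1_completeness(traces):
    """Return (observed, estimated, completeness) for a list of traces.

    Assumption: every distinct trace variant is treated as one species.
    The estimate uses the bias-corrected Chao1 formula:
        S_est = S_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
    where f1 and f2 are the numbers of variants seen exactly once and twice.
    """
    # Count how many times each trace variant occurs in the log.
    variant_counts = Counter(tuple(trace) for trace in traces)
    observed = len(variant_counts)
    if observed == 0:
        raise ValueError("cannot estimate completeness of an empty log")

    # Frequency of frequencies: how many variants occur once, twice, ...
    freq_of_freq = Counter(variant_counts.values())
    f1 = freq_of_freq[1]
    f2 = freq_of_freq[2]

    estimated = observed + f1 * (f1 - 1) / (2 * (f2 + 1))
    return observed, estimated, observed / estimated


# Toy log (made up for illustration): 5 variants, 3 of them seen only once.
log = (
    [["a", "b", "c"]] * 5
    + [["a", "c", "b"]] * 2
    + [["a", "d"], ["a", "b", "d"], ["a", "e"]]
)
observed, estimated, completeness = chao1_completeness(log)
print(observed, estimated, round(completeness, 3))
```

Many variants seen only once suggest that further variants remain unobserved, which pushes the estimated total up and the completeness ratio down. Other notions of "species" (for example, directly-follows relations instead of whole traces) would give different estimates.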