Thinking in Categories: A Survey on Assessing the Quality for Time Series Synthesis

Bibliographic details
Published in: ACM journal of data and information quality, 2024-06, Vol. 16 (2), p. 1-32, Article 14
Main authors: Stenger, Michael; Bauer, André; Prantl, Thomas; Leppich, Robert; Hudson, Nathaniel; Chard, Kyle; Foster, Ian; Kounev, Samuel
Format: Article
Language: English
Description
Summary: Time series data are widely used and provide a wealth of information for countless applications. However, some applications are faced with a limited amount of data, or the data cannot be used due to confidentiality concerns. To overcome these obstacles, time series can be generated synthetically. For example, electrocardiograms can be synthesized to make them available for building models to predict conditions such as cardiac arrhythmia without leaking patient information. Although many different approaches to time series synthesis have been proposed, evaluating the quality of synthetic time series data poses unique challenges and remains an open problem, as there is a lack of a clear definition of what constitutes a “good” synthesis. To this end, we present a comprehensive literature survey to identify different aspects of synthesis quality and their relationships. Based on this, we propose a definition of synthesis quality and a systematic evaluation procedure for assessing it. With this work, we aim to provide a common language and criteria for evaluating synthetic time series data. Our goal is to promote more rigorous and reproducible research in time series synthesis by enabling researchers and practitioners to generate high-quality synthetic time series data.
ISSN: 1936-1955, 1936-1963
DOI: 10.1145/3666006