A Methodology for Validating Diversity in Synthetic Time Series Generation


Detailed description

Bibliographic details
Published in: MethodsX 2021-01, Vol. 8, p. 101459-101459, Article 101459
Main authors: Bahrpeyma, Fouad; Roantree, Mark; Cappellari, Paolo; Scriney, Michael; McCarren, Andrew
Format: Article
Language: English
Online access: Full text
Description
Summary:
• This paper presents a new method for generating 50K diverse synthetic time series.
• We present a discussion on time series characteristics and metrics with a view to understanding time series diversity.
• We developed a robust framework for validating diversity in synthetic time series generation.

To deliver robust evaluations of time series models, researchers often require high volumes of data to ensure an appropriate level of rigor in testing. For many researchers, however, the lack of available time series presents a barrier to a deeper evaluation. While researchers have developed and used synthetic datasets, developing such data requires a methodological approach to testing the entire dataset against a set of metrics that capture its diversity. Unless researchers are confident that their test datasets display a broad set of time series characteristics, an evaluation may favor one type of predictive model over another, which can undermine the evaluation of new predictive methods. In this paper, we present a new approach to generating and evaluating a large number of time series. The construction algorithm and validation framework are described in detail, together with an analysis of the level of diversity present in the synthetic dataset.
ISSN: 2215-0161
DOI: 10.1016/j.mex.2021.101459