Using derivatives in time series classification

Bibliographic details
Published in:	Data mining and knowledge discovery 2013-03, Vol.26 (2), p.310-331
Main authors:	Gorecki, Tomasz; Luczak, Maciej
Format:	Article
Language:	English
Subjects:	Approximation; Artificial Intelligence; Business; Chemistry and Earth Sciences; Classification; Computer Science; Data mining; Data Mining and Knowledge Discovery; Datasets; Derivatives; Experiments; Indexing; Information Storage and Retrieval; Information systems; Mathematical analysis; Physics; Similarity measures; Statistics for Engineering; Time series
Online access:	Full text

Description
Abstract:	Over recent years the popularity of time series has soared. Given the widespread use of modern information technology, a large number of time series may be collected during business, medical or biological operations, for example. As a consequence there has been a dramatic increase in the amount of interest in querying and mining such data, which in turn has resulted in a large number of works introducing new methodologies for indexing, classification, clustering and approximation of time series. In particular, many new distance measures between time series have been introduced. In this paper, we propose a new distance function based on a derivative. In contrast to well-known measures from the literature, our approach considers the general shape of a time series rather than point-to-point function comparison. The new distance is used in classification with the nearest neighbor rule. In order to provide a comprehensive comparison, we conducted a set of experiments, testing effectiveness on 20 time series datasets from a wide variety of application domains. Our experiments show that our method provides a higher quality of classification on most of the examined datasets.
ISSN:	1384-5810; 1573-756X
DOI:	10.1007/s10618-012-0251-4
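
Illustration:	The abstract does not state the formula of the proposed distance, so the following is only a minimal sketch of the general idea and not the authors' definition. It assumes that the "derivative" is the first difference of a series, that the base measure is the Euclidean distance between equal-length series, and that the raw and derivative distances are blended by a hypothetical weight `alpha`. All function names are illustrative.

```python
from math import sqrt


def first_difference(series):
    """Discrete derivative (assumption): differences of consecutive values."""
    return [b - a for a, b in zip(series, series[1:])]


def euclidean(x, y):
    """Point-to-point Euclidean distance between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def derivative_distance(x, y, alpha=0.5):
    """Blend the raw distance with the distance between first differences.

    alpha is a hypothetical weight: 0 uses only the raw values,
    1 uses only the derivatives (the shape of the series).
    """
    raw = euclidean(x, y)
    shape = euclidean(first_difference(x), first_difference(y))
    return (1 - alpha) * raw + alpha * shape


def classify_1nn(query, train_series, train_labels, alpha=0.5):
    """Nearest neighbor rule: return the label of the closest training series."""
    best = min(
        range(len(train_series)),
        key=lambda i: derivative_distance(query, train_series[i], alpha),
    )
    return train_labels[best]


if __name__ == "__main__":
    train = [[0, 1, 2, 3], [3, 2, 1, 0]]
    labels = ["up", "down"]
    print(classify_1nn([1, 2, 3, 4], train, labels))
```

With `alpha=1.0` the sketch ignores vertical offset entirely, so `[0, 1, 2, 3]` and `[1, 2, 3, 4]` are at distance zero because they have the same shape. How the weight would be chosen, and which base distance to use, are left open here because the abstract does not say.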