Mahalanobis Distance Metric Learning Algorithm for Instance-based Data Stream Classification


Bibliographic details
Published in: arXiv.org 2016-04
Main authors: Rivero Perez, Jorge Luis; Ribeiro, Bernardete; Carlos Morell Perez
Format: Article
Language: English
Description
Abstract: With today's massive data challenges and the rapid growth of technology, stream mining has recently received considerable attention. Addressing the large number of scenarios in which this phenomenon manifests itself requires suitable tools in various research fields. Instance-based data stream algorithms generally employ the Euclidean distance for the underlying classification task. A novel way to approach this issue is to use a more flexible metric, given the increased requirements imposed by the data stream scenario. In this paper we present a new algorithm that learns a Mahalanobis metric online using similarity and dissimilarity constraints. This approach hybridizes a Mahalanobis distance metric learning algorithm with a k-NN data stream classification algorithm that includes concept drift detection. First, some basic aspects of Mahalanobis distance metric learning are described, taking into account key properties as well as online distance metric learning algorithms. Second, we implement specific evaluation methodologies and comparative metrics, such as the Q statistic, for data stream classification algorithms. Finally, our algorithm is evaluated on different datasets by comparing its results with one of the best state-of-the-art instance-based data stream classification algorithms. The results demonstrate that our proposal is better
ISSN:2331-8422
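
Illustrative note: the abstract contrasts the Euclidean distance with a Mahalanobis metric inside a k-NN classifier. The minimal sketch below shows only that contrast and is not the authors' algorithm. It assumes the standard squared Mahalanobis form (x - y)^T M (x - y) with a positive semidefinite matrix M. The names `mahalanobis_sq`, `knn_predict`, `window`, and the hand-picked matrix `M_learned` are hypothetical, and the online learning of M from similarity and dissimilarity constraints and the concept drift detection are not shown.

```python
from collections import Counter


def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y).

    M is assumed to be a symmetric positive semidefinite matrix given as
    a list of rows. With M equal to the identity this reduces to the
    squared Euclidean distance.
    """
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def knn_predict(query, window, M, k=3):
    """Majority vote among the k instances in `window` closest to `query`.

    `window` is a list of (features, label) pairs, standing in for the
    instances an instance-based stream classifier keeps in memory.
    """
    nearest = sorted(window, key=lambda item: mahalanobis_sq(query, item[0], M))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Two stored instances and one query point (toy data, not from the paper).
window = [((0.0, 3.0), "a"), ((2.0, 0.0), "b")]
query = (0.0, 0.0)

identity = [[1.0, 0.0], [0.0, 1.0]]     # Euclidean case
M_learned = [[10.0, 0.0], [0.0, 0.1]]   # hypothetical metric, chosen by hand

euclidean_label = knn_predict(query, window, identity, k=1)
mahalanobis_label = knn_predict(query, window, M_learned, k=1)
```

With the identity matrix the nearest neighbour of the query is the instance labelled "b"; with the hand-picked metric, which weights the first feature heavily and the second lightly, it becomes the instance labelled "a". This is the kind of flexibility a learned metric can add over a fixed Euclidean distance.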