A Framework for Similarity Search in Streaming Time Series based on Spark Streaming
Similarity search in streaming time series is a challenging problem due to tight requirements in processing streaming data and replying feedback, e.g., quickly processing a time-series stream of high speed, and accurately replying found results to a query system. These difficulties urge researchers...
Gespeichert in:
Veröffentlicht in: | Mobile networks and applications 2022-10, Vol.27 (5), p.2084-2097 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Similarity search in streaming time series is a challenging problem due to tight requirements in processing streaming data and replying feedback, e.g., quickly processing a time-series stream of high speed, and accurately replying found results to a query system. These difficulties urge researchers of time-series data mining to have a framework at hand for building systems of similarity search in streaming time series based on a platform specializing in handling streaming data. In the paper, we introduce a framework of similarity search in streaming time series based on Spark Streaming. Subsequently, a prototype system implementing the framework would be proposed to demonstrate the feasibility of the framework for building similarity search systems which can work efficiently and effectively in streaming context. In addition, the prototype system takes advantages of SUCR-DTW to perform similarity search efficiently in streaming environment under Dynamic Time Warping. The experimental results obtained from the prototype system demonstrate that the Spark job of similarity search in streaming time series is accomplished quickly and accurately. The subsequences of streaming time series, which are similar to predefined queries, are found in near real time. They are the same as those obtained from the execution of similarity search in streaming time series by another reference system. Furthermore, the prototype system has high scalability, stably works while processing time-series streams of high steady rate. These experimental results also underline the value of the combination of Spark Streaming and SUCR-DTW to handle the challenging problem. |
---|---|
ISSN: | 1383-469X 1572-8153 |
DOI: | 10.1007/s11036-022-01988-6 |