Input data selection for daily traffic flow forecasting through contextual mining and intra-day pattern recognition

Bibliographic Details
Published in:	Expert Systems with Applications, 2021-08, Vol. 176, p. 114902, Article 114902
Main authors:	Ma, Dongfang; Song, Xiang Ben; Zhu, Jiacheng; Ma, Weihao
Format:	Article
Language:	English
Subjects:	Algorithms; Clustering; Data mining; Flow distribution; Forecasting; Genetic algorithms; Input data selection; NSGA-II; Pattern recognition; Similarity; Sorting algorithms; Traffic flow; Traffic flow forecasting
Online access:	Full text

Description
Summary:	• Select appropriate historical days of data to enhance forecasting performance.
• Use contextual information to measure the similarity between the data and the target.
• Match the target day with a clustered group using ordered contextual factors.

There is a large body of literature on traffic flow forecasting, and most existing studies focus on the prediction algorithm itself. However, selecting appropriate historical data as input is also vital for the prediction task, and studies on this question are limited. This paper aims to fill this gap and proposes a method for selecting appropriate historical data for daily traffic flow forecasting. The main idea is that contextual factors, including season, day of the week, weather, and holidays, influence the daily traffic flow pattern, so historical days with a pattern similar to that of the target day are selected as training data for the prediction algorithm. The method consists of three steps. First, the similarity between the traffic flow series of any two days is measured by Dynamic Time Warping, and the historical days are then divided into groups using a density-peak clustering algorithm. Second, the contextual factors are sorted by the Elitist Non-dominated Sorting Genetic Algorithm (NSGA-II) using the clustering results, and their degrees of importance are transformed into weights to better measure the similarity between the clustered groups of days and the target day. Third, one clustered group of historical data is selected based on the weighted degree of similarity, and this group is used as the input for the prediction algorithm. Finally, the benefits of the new method are discussed based on a Seattle case study, which illustrates that the proposed approach achieves higher prediction accuracy and stability across various prediction algorithms.
ISSN:	0957-4174 1873-6793
DOI:	10.1016/j.eswa.2021.114902
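
Illustrative sketch (not taken from the article): the summary names Dynamic Time Warping for comparing the flow series of two days and a weighted contextual match for choosing a clustered group. The Python below is a minimal sketch of those two ideas under stated assumptions. The function names dtw_distance and select_group, the absolute-difference cost, the exact-match scoring rule, and the example weights are all hypothetical; in the article the weights are derived from an NSGA-II ordering of the contextual factors, and the groups come from density-peak clustering, neither of which is reproduced here.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic-programming DTW between two numeric sequences.

    Assumption: absolute difference as the local cost; the article may
    use a different local cost or a windowed variant.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also matched an earlier b point
                cost[i][j - 1],      # b[j-1] also matched an earlier a point
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def select_group(target, groups, weights):
    """Pick the group whose contextual profile best matches the target day.

    Hypothetical scoring rule: sum the weights of the factors whose value
    in the group profile equals the value for the target day.
    """
    def score(profile):
        return sum(
            w for factor, w in weights.items()
            if profile.get(factor) == target.get(factor)
        )
    return max(groups, key=lambda name: score(groups[name]))


# Two days with the same peak shape, one delayed by a time step.
day_a = [10, 12, 30, 28, 11]
day_b = [10, 10, 12, 30, 28, 11]
shape_gap = dtw_distance(day_a, day_b)

# Hypothetical contextual profiles and weights, for illustration only.
weights = {"season": 0.2, "weekday": 0.5, "weather": 0.2, "holiday": 0.1}
groups = {
    "A": {"season": "summer", "weekday": "Mon", "weather": "rain", "holiday": False},
    "B": {"season": "winter", "weekday": "Sat", "weather": "clear", "holiday": False},
}
target = {"season": "summer", "weekday": "Sat", "weather": "rain", "holiday": False}
chosen = select_group(target, groups, weights)
```

In this sketch a time-shifted copy of a daily profile stays close to the original under DTW, which is the property that makes DTW a plausible choice for grouping days by intra-day pattern. The selected group would then supply the training days for whichever prediction algorithm is in use.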