Comprehensive evaluation of multiple machine learning classifiers for predicting freeway incident duration

Bibliographic details
Published in: Innovative infrastructure solutions : the official journal of the Soil-Structure Interaction Group in Egypt (SSIGE), 2023-06, Vol. 8 (6), Article 177
Main authors: Hamad, Khaled; Obaid, Lubna; Nassif, Ali Bou; Abu Dabous, Saleh; Al-Ruzouq, Rami; Zeiada, Waleed
Format: Article
Language: English
Description
Abstract: This study compares the accuracy and complexity of eleven machine learning classifiers for incident duration (ID) prediction. The proposed framework integrates feature selection and modeling techniques to evaluate the effect of multiple influencing factors and to choose the best model for predicting incident durations. Models were developed and tested on a dataset of more than 110,000 records collected from the Houston TranStar incident archive. Features were selected by integrating the results of information gain, correlation-based, and relief-based evaluators. The developed and fine-tuned classifiers were compared on multiple accuracy measures (precision, recall, F1 score, and AUC) and complexity measures (memory storage, training time, and testing time). Among the developed models, support vector machines (SVM), K-Nearest Neighbors (K-NN), and Gaussian process classification outperformed the other classifiers, with a prediction accuracy of 97%. The Decision Tree classifier recorded the lowest performance, with a prediction accuracy of 82%. Considering the trade-off between accuracy and complexity, the classifier combining high accuracy with low training time was K-NN, which achieved an accuracy of 97% with 0.024 s of training time, 0.042 s of testing time, and 0.04 megabytes of memory storage. The SVM achieved the same 97% accuracy while using much less memory (0.004 megabytes) and a testing time of 0.01 s. Although K-NN recorded the lowest training time, the SVM can be considered the best model for the ID-prediction classification problem.
ISSN: 2364-4176, 2364-4184
DOI: 10.1007/s41062-023-01138-1
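
The abstract compares classifiers on accuracy measures and on complexity measures (training time, testing time, memory storage). The following is a minimal sketch of how such measurements could be collected for one classifier, not the authors' implementation. Everything in it is an assumption made for illustration: the names `TinyKNN`, `macro_scores`, `evaluate`, and `make_data` are hypothetical, the data is synthetic rather than the Houston TranStar records, the size of the pickled model stands in for "memory storage" (the record does not say how memory was measured), and AUC is omitted for brevity.

```python
import pickle
import random
import time
from collections import Counter


class TinyKNN:
    """Minimal k-nearest-neighbors classifier (illustrative only)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # k-NN "training" only stores the data, which is why it trains fast.
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for q in X:
            # Rank training points by squared Euclidean distance to the query.
            nearest = sorted(
                range(len(self.X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self.X[i], q)),
            )[: self.k]
            # Majority vote among the k nearest labels.
            preds.append(Counter(self.y[i] for i in nearest).most_common(1)[0][0])
        return preds


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over the classes in y_true."""
    labels = sorted(set(y_true))
    p_sum = r_sum = f_sum = 0.0
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        p_sum, r_sum, f_sum = p_sum + prec, r_sum + rec, f_sum + f1
    n = len(labels)
    return p_sum / n, r_sum / n, f_sum / n


def evaluate(model, X_train, y_train, X_test, y_test):
    """Collect accuracy and complexity measures for one classifier."""
    t0 = time.perf_counter()
    model.fit(X_train, y_train)
    train_s = time.perf_counter() - t0

    t0 = time.perf_counter()
    y_pred = model.predict(X_test)
    test_s = time.perf_counter() - t0

    precision, recall, f1 = macro_scores(y_test, y_pred)
    return {
        "precision": precision, "recall": recall, "f1": f1,
        "train_s": train_s, "test_s": test_s,
        # Serialized size as a rough proxy for memory storage (assumption).
        "model_mb": len(pickle.dumps(model)) / 1e6,
    }


def make_data(n, rng):
    """Synthetic stand-in: two well-separated clusters labeled 'short'/'long'."""
    X, y = [], []
    for _ in range(n):
        label = rng.choice(["short", "long"])
        center = 0.0 if label == "short" else 5.0
        X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(200, rng)
X_test, y_test = make_data(50, rng)
report = evaluate(TinyKNN(k=3), X_train, y_train, X_test, y_test)
print(report)
```

Running `evaluate` once per candidate model and tabulating the resulting dictionaries would give the kind of accuracy-versus-complexity comparison the abstract describes; the timings and sizes will vary by machine and are not comparable to the figures reported in the article.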