Multi-stage deep learning approaches to predict boarding behaviour of bus passengers
Published in: | Sustainable cities and society 2021-10, Vol.73, p.103111, Article 103111 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | • Developing a multi-stage framework to predict boarding patterns of bus passengers. • Addressing the data issues arising from imbalanced data and many-class problems. • Applying deep learning algorithms to solve the multi-label classification problem. • High accuracy in predicting bus ridership at stop-, line- and network-level. • RNN and LSTM accurately predict the temporal characteristics of the ridership; FCN provides better prediction of the spatial distribution.
Smart card data has emerged in recent years and provides a comprehensive and cheap source of information for planning and managing public transport systems. This paper presents a multi-stage machine learning framework to predict passengers’ boarding stops using smart card data. The framework addresses the challenges arising from the imbalanced nature of the data (e.g. many non-travelling records) and the ‘many-class’ issue (e.g. many possible boarding stops) by decomposing the prediction of hourly ridership into three stages: whether to travel or not in that one-hour time slot, which bus line to use, and at which stop to board. A simple neural network architecture, fully connected networks (FCN), and two deep learning architectures, recurrent neural networks (RNN) and long short-term memory networks (LSTM), are implemented. The proposed approach is applied to a real-life bus network. We show that the data imbalance has a profound impact on the accuracy of prediction at the individual level. At the aggregate level, FCN accurately predicts the ridership at individual stops but is poor at capturing the temporal distribution of ridership. RNN and LSTM are able to capture the temporal distribution but lack the ability to capture the spatial distribution across bus lines. |
---|---|
ISSN: | 2210-6707 2210-6715 |
DOI: | 10.1016/j.scs.2021.103111 |
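
The three-stage decomposition described in the summary (travel or not, which line, which stop) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `predict_boarding`, the 0.5 threshold, the most-probable-class rule and the toy stand-in models are all hypothetical, and the record does not say how the paper chains its FCN, RNN or LSTM stages. The sketch only shows why the split helps: the large "no travel" class is handled once in stage 1, and the stop classifier only has to choose among stops on one line.

```python
from typing import Callable, Dict, Optional, Tuple

Features = Dict[str, float]


def predict_boarding(
    features: Features,
    travel_model: Callable[[Features], float],
    line_model: Callable[[Features], Dict[str, float]],
    stop_model: Callable[[Features, str], Dict[str, float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Optional[Tuple[str, str]]:
    """Return (line, stop) for one passenger-hour, or None if no trip is predicted."""
    # Stage 1: binary decision, travel or not in this one-hour slot.
    if travel_model(features) < threshold:
        return None
    # Stage 2: choose the most probable bus line.
    line_probs = line_model(features)
    line = max(line_probs, key=line_probs.get)
    # Stage 3: choose the most probable boarding stop on that line only.
    stop_probs = stop_model(features, line)
    stop = max(stop_probs, key=stop_probs.get)
    return line, stop


# Toy stand-ins for the trained networks (hypothetical values).
def toy_travel(f: Features) -> float:
    return 0.9 if f["hour"] in (8.0, 17.0) else 0.1


def toy_line(f: Features) -> Dict[str, float]:
    return {"line_A": 0.7, "line_B": 0.3}


def toy_stop(f: Features, line: str) -> Dict[str, float]:
    stops = {
        "line_A": {"stop_1": 0.2, "stop_2": 0.8},
        "line_B": {"stop_3": 1.0},
    }
    return stops[line]


peak = predict_boarding({"hour": 8.0}, toy_travel, toy_line, toy_stop)
off_peak = predict_boarding({"hour": 3.0}, toy_travel, toy_line, toy_stop)
print(peak)      # ('line_A', 'stop_2')
print(off_peak)  # None
```

Hourly ridership at a stop would then be obtained by counting the predicted boardings over all passengers for that hour; any real system would replace the toy functions with trained models.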