Markov Chain Monte Carlo Multiple Imputation Using Bayesian Networks for Incomplete Intelligent Transportation Systems Data

Bibliographic details
Published in: Transportation Research Record 2005-01, Vol. 1935 (1935), p. 57-67
Main authors: Ni, Daiheng; Leonard II, John
Format: Article
Language: English
Online access: Full text
Description
Summary: The rich data on intelligent transportation systems (ITS) are a precious resource for transportation researchers and practitioners. However, the usability of this resource is greatly limited by missing data. Many imputation methods have been proposed in the past decade. However, some issues are still not addressed or are not sufficiently addressed, for example, the absence of entire records, temporal correlation in observations, natural characteristics in raw data, and unbiased estimates for missing values. This paper proposes an advanced imputation method based on recent developments in other disciplines, especially applied statistics. The method uses a Bayesian network to learn from the raw data and a Markov chain Monte Carlo technique to sample from the probability distributions learned by the Bayesian network. It imputes the missing data multiple times and makes statistical inferences about the result. In addition, the method incorporates a time series model so that it allows data missing in entire rows, an unfavorable missing pattern frequently seen in ITS data. Empirical study shows that the proposed method is robust and accurate. It is ideal for use as a high-quality imputation method for off-line application.
ISSN:0361-1981
DOI:10.3141/1935-07
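
The summary says the method imputes the missing data multiple times and then makes statistical inferences about the result. The record does not say how the results are combined, so the following is only a minimal sketch that assumes the standard pooling formulas for multiple imputation (Rubin's rules). The draw step is a deliberately simple placeholder (a normal distribution fitted to the observed values), not the Bayesian network and Markov chain Monte Carlo sampler of the paper. All function names and the sample readings are hypothetical.

```python
import random
import statistics


def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # mean within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between     # total variance of the pooled estimate
    return q_bar, total


def impute_once(values, rng):
    """Fill None entries with draws from a normal fitted to the observed values.

    Placeholder only (assumption): the paper samples from distributions
    learned by a Bayesian network, which is not reproduced here.
    """
    observed = [v for v in values if v is not None]
    mu = statistics.fmean(observed)
    sigma = statistics.stdev(observed)
    return [v if v is not None else rng.gauss(mu, sigma) for v in values]


def multiply_impute_mean(values, m=5, seed=0):
    """Impute m times, estimate the mean each time, and pool the results."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(values, rng)
        estimates.append(statistics.fmean(completed))
        # Variance of the sample mean for this completed data set.
        variances.append(statistics.variance(completed) / len(completed))
    return pool_rubin(estimates, variances)


# Hypothetical speed readings with gaps marked as None (not data from the paper).
speeds = [61.0, 58.5, None, 60.2, None, 57.8, 59.4, 62.1]
pooled_mean, pooled_var = multiply_impute_mean(speeds)
```

The pooled variance adds the between-imputation spread to the average within-imputation variance, so the uncertainty caused by the missing values shows up in the final inference instead of being hidden by a single filled-in value.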