A hierarchical mixture modeling framework for population synthesis

Bibliographic details
Published in: Transportation research. Part B: methodological, 2018-08, Vol. 114, p. 199-212
Main authors: Sun, Lijun; Erath, Alexander; Cai, Ming
Format: Article
Language: English
Description
Summary:
• We model population data using a two-level nominal categorical structure.
• We use a probabilistic tensor factorization to model all categorical attributes.
• We apply a multilevel latent class model to capture the cross-level associations.
• Rejection sampling is used to reproduce the associations among household members.
• We present a case study on generating a synthetic population of Singapore.

A synthetic population is a key input to agent-based urban/transportation microsimulation models. The objective of population synthesis is to reproduce the underlying statistical properties of the real population based on available microsamples and marginal distributions. However, characterizing the joint associations among a large set of attributes is challenging because of the curse of dimensionality, in particular when attributes are organized in a hierarchical household-individual structure.

In this paper, we use a hierarchical mixture model to characterize the joint distribution of both household and individual attributes. Based on this model, we propose a framework for generating representative household structures in population synthesis. The framework integrates three models: (1) probabilistic tensor factorization, (2) a multilevel latent class model, and (3) rejection sampling. With this framework, one can not only generalize the associations of within- and cross-level attributes but also reproduce structural relationships among household members (e.g., husband-wife).

As a case study, we implement this framework based on the household interview travel survey (HITS) data of Singapore, and then use the inferred model to generate a synthetic population pool. The model shows great potential in reproducing the underlying statistical distribution of the real population. The generated synthetic population can serve as a replacement for census data in developing agent-based models while preserving privacy and confidentiality.
ISSN: 0191-2615, 1879-2367
DOI: 10.1016/j.trb.2018.06.002
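
Illustrative sketch (not from the article): the summary names a latent class mixture combined with rejection sampling. The toy Python example below shows one way these two ideas can fit together. It is a heavily simplified, single-level stand-in for the multilevel model, and it assumes that member attributes are conditionally independent given a household-level latent class. All names (`CLASS_WEIGHTS`, `propose_household`, `is_valid`, `synthesize`), all probabilities, and the validity rules are hypothetical and are not taken from the paper or from the HITS data.

```python
import random

# Hypothetical toy parameters; none of these values come from the paper.
CLASS_WEIGHTS = [0.6, 0.4]  # mixing weights of two latent household classes

# Per-class categorical distributions (assumed conditionally independent
# given the latent class, as in a basic latent class mixture).
SIZE_PROBS = [{1: 0.5, 2: 0.4, 3: 0.1}, {1: 0.1, 2: 0.3, 3: 0.6}]
ROLE_PROBS = [
    {"head": 0.5, "spouse": 0.3, "child": 0.2},
    {"head": 0.4, "spouse": 0.1, "child": 0.5},
]
AGE_PROBS = [{"adult": 0.9, "minor": 0.1}, {"adult": 0.5, "minor": 0.5}]


def draw(rng, probs):
    """Draw one category from a {category: probability} mapping."""
    categories = list(probs)
    weights = [probs[c] for c in categories]
    return rng.choices(categories, weights=weights, k=1)[0]


def propose_household(rng):
    """Sample a latent class, then a household size and member attributes."""
    k = rng.choices(range(len(CLASS_WEIGHTS)), weights=CLASS_WEIGHTS, k=1)[0]
    size = draw(rng, SIZE_PROBS[k])
    members = [
        {"role": draw(rng, ROLE_PROBS[k]), "age": draw(rng, AGE_PROBS[k])}
        for _ in range(size)
    ]
    return {"latent_class": k, "members": members}


def is_valid(household):
    """Hypothetical structural rules: one head, at most one spouse, both adults."""
    members = household["members"]
    roles = [m["role"] for m in members]
    partners_adult = all(
        m["age"] == "adult" for m in members if m["role"] in ("head", "spouse")
    )
    return roles.count("head") == 1 and roles.count("spouse") <= 1 and partners_adult


def sample_household(rng, max_tries=10_000):
    """Rejection sampling: redraw until the proposal passes the structural rules."""
    for _ in range(max_tries):
        household = propose_household(rng)
        if is_valid(household):
            return household
    raise RuntimeError("no valid household found within max_tries")


def synthesize(n, seed=0):
    """Generate n structurally valid synthetic households."""
    rng = random.Random(seed)
    return [sample_household(rng) for _ in range(n)]
```

Calling `synthesize(1000)` returns a list of household dictionaries that all satisfy the structural rules. In this sketch the mixture supplies the statistical associations among attributes, while the rejection step enforces relationships that the conditionally independent draws cannot guarantee on their own.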