Cross-scale generative adversarial network for crowd density estimation from images

Bibliographic details
Published in:	Engineering Applications of Artificial Intelligence, 2020-09, Vol. 94, p. 103777, Article 103777
Main authors:	Zhang, Gaowei; Pan, Yue; Zhang, Limao; Tiong, Robert Lee Kong
Format:	Article
Language:	eng
Subjects:	Automation & Control Systems; Computer Science; Computer Science, Artificial Intelligence; Crowd density estimation; Deconvolution convolutions; Engineering; Engineering, Electrical & Electronic; Engineering, Multidisciplinary; Generative adversarial network; Loss function; Science & Technology; Technology

Description
Abstract:	This research develops a cross-scale convolutional spatial generative adversarial network (CSGAN) to estimate crowd density from images accurately. It consists of two similar generators, one for whole-image feature extraction and the other for patch-scale feature extraction. An encoder–decoder structure is employed to generate density maps from input images or patches. Additionally, a new objective function for crowd counting, called cross-scale consistency pursuit, is developed; it combines an adversarial loss, an L2 loss, a perceptual loss, and a consistency loss to make the generated density maps more realistic and closer to the ground truth. The effectiveness of the proposed CSGAN is verified on two public datasets. Results indicate that, compared with other state-of-the-art methods on the test datasets, the new objective function achieves the best evaluation-metric values in both low-density and high-density crowd scenes. Moreover, the proposed CSGAN is more practical and flexible owing to its lower computational complexity, and its estimation capability improves significantly even with a small amount of training data. Overall, this research contributes a novel computer vision approach, together with a new objective function, for generating density maps from cross-scale crowd images, enabling more accurate and efficient counting.
• A cross-scale convolutional spatial generative adversarial network is developed.
• A new objective function for crowd counting, called cross-scale consistency pursuit, is proposed.
• Estimation capability is significantly improved even with a small amount of training data.
• The model adapts to both low-density and high-density crowd scenes.
• The estimation accuracy of the proposed model is better than that of state-of-the-art models.
ISSN:	0952-1976, 1873-6769
DOI:	10.1016/j.engappai.2020.103777
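
The abstract names four loss terms (adversarial, L2, perceptual, consistency) but gives no formulas, so the following is only a minimal NumPy sketch of how such a combined objective could be assembled, not the authors' implementation. The 2x2 patch layout, the non-saturating adversarial term, the mean-squared form of each loss, the `features` extractor, and the `weights` are all assumptions made for illustration. The one standard convention used is that a crowd count is the sum of a density map.

```python
import numpy as np


def l2_loss(pred, target):
    """Mean squared error between a generated and a ground-truth density map."""
    return float(np.mean((pred - target) ** 2))


def adversarial_loss(d_fake, eps=1e-8):
    """Assumed non-saturating generator loss, -log D(G(x)), on discriminator scores in (0, 1]."""
    return float(-np.mean(np.log(d_fake + eps)))


def perceptual_loss(pred, target, features):
    """MSE in a feature space; `features` is a hypothetical extractor passed in by the caller."""
    return float(np.mean((features(pred) - features(target)) ** 2))


def stitch_patches(patch_maps):
    """Assumed 2x2 layout: [top-left, top-right, bottom-left, bottom-right]."""
    top = np.concatenate([patch_maps[0], patch_maps[1]], axis=1)
    bottom = np.concatenate([patch_maps[2], patch_maps[3]], axis=1)
    return np.concatenate([top, bottom], axis=0)


def consistency_loss(whole_map, patch_maps):
    """Penalise disagreement between the whole-image map and the stitched patch maps."""
    return float(np.mean((whole_map - stitch_patches(patch_maps)) ** 2))


def combined_objective(whole_map, patch_maps, target, d_fake, features,
                       weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four terms; the weights are placeholders, not published values."""
    w_adv, w_l2, w_per, w_con = weights
    return (w_adv * adversarial_loss(d_fake)
            + w_l2 * l2_loss(whole_map, target)
            + w_per * perceptual_loss(whole_map, target, features)
            + w_con * consistency_loss(whole_map, patch_maps))


def estimated_count(density_map):
    """A crowd count is read off a density map by summing it."""
    return float(np.sum(density_map))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((4, 4))
    whole = target + 0.01 * rng.standard_normal((4, 4))
    patches = [whole[:2, :2], whole[:2, 2:], whole[2:, :2], whole[2:, 2:]]
    loss = combined_objective(whole, patches, target,
                              d_fake=np.array([0.7, 0.8]),
                              features=lambda m: m)  # identity stands in for a real feature network
    print("loss:", loss, "count:", estimated_count(whole))
```

In this sketch the consistency term is zero whenever the patch-scale maps tile exactly to the whole-image map, which is one plausible reading of "cross-scale consistency"; the article itself should be consulted for the actual definition.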