Parking Space Status Inference Upon a Deep CNN and Multi-Task Contrastive Network With Spatial Transform


Bibliographic details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2019-04, Vol. 29 (4), p. 1194-1208
Main authors: Vu, Hoang Tran; Huang, Ching-Chun
Format: Article
Language: English
Description
Summary: Deep learning methods, especially CNNs, have achieved promising results in a wide range of computer vision applications. However, few studies have focused on designing deep learning methods suited to parking space status inference. Detecting parking spaces in an outdoor environment is known to be challenging because of dynamic lighting variations, weather changes, and perspective distortion. Off-the-shelf CNNs may handle lighting variations well. However, a practical and robust inference system must also address troublesome problems such as parking displacement, non-uniform car sizes, inter-object occlusion, and perspective distortion. These problems become even more challenging when differences in space size are also considered. To overcome them, we propose a custom-tailored deep convolutional and contrastive network with three contributions. First, we introduce a Siamese architecture to learn a contrastive and robust feature descriptor, which helps to reduce the effects of varied inter-object occlusion. Second, we integrate a convolutional Spatial Transformer Network (STN) to adaptively transform a 3-space input patch according to vehicle size and parking displacement. The STN also helps to overcome the perspective distortion problem. Third, we design a multi-task loss function that trains the network by simultaneously considering the accuracy of inferring the status of the target space and the semantic smoothness of high-level features. This alleviates the errors caused by inter-object occlusion. To verify the proposed network, we visualize the learned features and analyze their functionality. Experiments and evaluations show the robustness of our system in parking status inference. The real-time system currently running in public parking lots also demonstrates the effectiveness of the proposed deep network.
ISSN: 1051-8215, 1558-2205
DOI:10.1109/TCSVT.2018.2826053
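
Illustrative note: the summary describes a multi-task loss that combines status classification accuracy with a contrastive term on the features of a Siamese pair, but it does not give the formula. The sketch below is therefore not the authors' loss. It assumes a plain cross-entropy term for each branch plus the widely used margin-based contrastive loss, swapped in here as a stand-in for the "semantic smoothness" term. All names (`cross_entropy`, `contrastive_loss`, `multi_task_loss`) and the `weight` and `margin` parameters are hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class, given predicted class probabilities."""
    return -math.log(max(probs[label], 1e-12))


def contrastive_loss(feat_a, feat_b, same_status, margin=1.0):
    """Margin-based contrastive loss on one pair of feature vectors (assumed form).

    Pairs with the same status are pulled together; pairs with different
    status are pushed until they are at least `margin` apart.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))
    if same_status:
        return 0.5 * dist ** 2
    return 0.5 * max(0.0, margin - dist) ** 2


def multi_task_loss(probs_a, label_a, probs_b, label_b,
                    feat_a, feat_b, weight=0.5, margin=1.0):
    """Classification loss of both Siamese branches plus a weighted contrastive term.

    `weight` is a hypothetical balancing factor, not a value from the paper.
    """
    classification = cross_entropy(probs_a, label_a) + cross_entropy(probs_b, label_b)
    contrastive = contrastive_loss(feat_a, feat_b, label_a == label_b, margin)
    return classification + weight * contrastive
```

In this sketch, two patches with the same status (for example, both occupied) are penalized when their features lie far apart, while a vacant/occupied pair is penalized only when its features are closer than `margin`. In a real network, both terms would be computed on batches with an automatic-differentiation framework.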