CLMIP: cross-layer manifold invariance based pruning method of deep convolutional neural network for real-time road type recognition

Recently, deep learning based models have demonstrated the superiority in a variety of visual tasks like object detection and instance segmentation. In practical applications, deploying advanced networks into real-time applications such as autonomous driving is still challenging due to expensive com...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Multidimensional systems and signal processing 2021, Vol.32 (1), p.239-262
Hauptverfasser: Zheng, Qinghe, Tian, Xinyu, Yang, Mingqiang, Su, Huake
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Recently, deep learning based models have demonstrated the superiority in a variety of visual tasks like object detection and instance segmentation. In practical applications, deploying advanced networks into real-time applications such as autonomous driving is still challenging due to expensive computational cost and memory footprint. In this paper, to reduce the size of deep convolutional neural network (CNN) and accelerate its reasoning, we propose a cross-layer manifold invariance based pruning method named CLMIP for network compression to help it complete real-time road type recognition in low-cost vision system. Manifolds are higher-dimensional analogues of curves and surfaces, which can be self-organized to reflect the data distribution and characterize the relationship between data. Therefore, we hope to guarantee the generalization ability of deep CNN by maintaining the consistency of the data manifolds of each layer in the network, and then remove the parameters with less influence on the manifold structure. Therefore, CLMIP can be regarded as a tool to further investigate the dependence of model structure on network optimization and generalization. To the best of our knowledge, this is the first time to prune deep CNN based on the invariance of data manifolds. During experimental process, we use the python based keyword crawler program to collect 102 first-view videos of car cameras, including 137 200 images (320 × 240) of four road scenes (urban road, off-road, trunk road and motorway). Finally, the classification results have demonstrated that CLMIP can achieve state-of-the-art performance with a speed of 26 FPS on NVIDIA Jetson Nano.
ISSN:0923-6082
1573-0824
DOI:10.1007/s11045-020-00736-x