Deep neural network pruning method based on sensitive layers and reinforcement learning

Bibliographic Details
Published in: Artificial Intelligence Review, 2023-11, Vol. 56 (Suppl 2), pp. 1897-1917
Authors: Yang, Wenchuan; Yu, Haoran; Cui, Baojiang; Sui, Runqi; Gu, Tianyu
Format: Article
Language: English
Online access: Full text
Description
Abstract: Compressing neural network models so that they can be deployed on resource-constrained embedded mobile devices is of great practical importance. However, because there is little theoretical guidance for identifying non-salient network components, existing model compression methods are inefficient and labor-intensive. In this paper, we propose a new pruning method for model compression. By examining the ranks of the feature maps produced by convolutional layers, we introduce the concept of sensitive layers, treating layers with more low-rank feature maps as sensitive. We propose an algorithm for finding sensitive layers, and we use a deterministic reinforcement-learning policy to automate pruning of the insensitive layers. Experimental results show that our method improves significantly on the state of the art in reducing floating-point operations and parameters, with lower accuracy loss. For example, on CIFAR-10, our method reduces the floating-point operations of ResNet-110 by 62.2% while removing 63.9% of its parameters; on ImageNet, it reduces the floating-point operations of ResNet-50 by 53.8% while deleting 39.9% of its parameters.
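
The abstract's core signal is the rank of each convolutional layer's feature maps, with layers containing many low-rank maps treated as sensitive. Purely as an illustration of that idea (not the authors' implementation: the rank criterion, the low_rank_frac and 0.6 thresholds, and the use of ResNet-18 below are assumptions, and the reinforcement-learning pruning agent is not reproduced), a minimal PyTorch sketch of per-layer rank statistics might look like this:

    # Minimal sketch: estimate feature-map ranks per conv layer on a batch
    # and flag layers with a high share of low-rank maps as "sensitive".
    import torch
    import torch.nn as nn
    import torchvision.models as models

    @torch.no_grad()
    def layer_rank_stats(model, images, low_rank_frac=0.5):
        """Return {layer_name: fraction of low-rank feature maps}."""
        stats = {}
        hooks = []

        def make_hook(name):
            def hook(module, inputs, output):
                # output: (batch, channels, h, w); rank each h x w map
                b, c, h, w = output.shape
                maps = output.reshape(b * c, h, w)
                ranks = torch.linalg.matrix_rank(maps).float()
                # a map is "low rank" if its rank is well below the maximum
                # possible rank min(h, w); the cutoff is an assumption
                low = (ranks < low_rank_frac * min(h, w)).float().mean().item()
                stats[name] = low
            return hook

        for name, module in model.named_modules():
            if isinstance(module, nn.Conv2d):
                hooks.append(module.register_forward_hook(make_hook(name)))

        model.eval()
        model(images)          # one forward pass triggers all hooks
        for h in hooks:
            h.remove()
        return stats

    model = models.resnet18(weights=None)       # stand-in architecture
    images = torch.randn(4, 3, 224, 224)        # stand-in for real batches
    stats = layer_rank_stats(model, images)

    # Layers whose share of low-rank maps exceeds a (hypothetical) threshold
    # are treated as sensitive and shielded from aggressive pruning.
    sensitive = {n for n, frac in stats.items() if frac > 0.6}
    print(sorted(sensitive))

In the paper's scheme, layers flagged as sensitive would be preserved, while a deterministic reinforcement-learning policy selects pruning ratios for the remaining insensitive layers; that agent is beyond the scope of this sketch.
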
ISSN: 0269-2821 (print), 1573-7462 (electronic)
DOI: 10.1007/s10462-023-10566-5