Entropy Induced Pruning Framework for Convolutional Neural Networks
Format: Article
Language: English
Abstract: Structured pruning techniques have achieved strong compression performance on convolutional neural networks for image classification tasks. However, the majority of existing methods are weight-oriented, and their pruning results may be unsatisfactory when the original model is trained poorly. That is, a fully trained model is required to provide useful weight information. This may be time-consuming, and the pruning results are sensitive to the updating process of the model parameters. In this paper, we propose a metric named Average Filter Information Entropy (AFIE) to measure the importance of each filter. It is calculated in three major steps: low-rank decomposition of the "input-output" matrix of each convolutional layer, normalization of the obtained eigenvalues, and calculation of filter importance based on information entropy. By leveraging AFIE, the proposed framework yields a stable importance evaluation for each filter regardless of whether the original model is fully trained. We implement AFIE on AlexNet, VGG-16, and ResNet-50, and test them on MNIST, CIFAR-10, and ImageNet, respectively. The experimental results are encouraging. We surprisingly observe that even when the original model is trained for only one epoch, the importance evaluation of each filter remains identical to the result obtained when the model is fully trained. This indicates that the proposed pruning strategy can be applied effectively at the early stage of training the original model.
DOI: 10.48550/arxiv.2208.06660
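The abstract describes AFIE as a three-step computation: low-rank decomposition of each convolutional layer's "input-output" matrix, normalization of the resulting eigenvalues, and an entropy-based importance score. The sketch below is one plausible reading of those steps in PyTorch, not the authors' implementation: it assumes the "input-output" matrix is the weight tensor flattened to (out_channels, in_channels × kh × kw), that the eigenvalues are the squared singular values of that matrix, and that every filter in a layer shares the layer's averaged entropy score; the function name afie_score is likewise made up for illustration.

```python
import torch

def afie_score(conv_weight: torch.Tensor) -> float:
    """Hypothetical AFIE-style importance score for one convolutional layer.

    conv_weight: 4-D weight tensor of shape (out_channels, in_channels, kh, kw).
    Returns a single score shared by all filters of the layer (assumption).
    """
    out_channels = conv_weight.shape[0]

    # Step 1: low-rank decomposition of the flattened "input-output" matrix.
    mat = conv_weight.reshape(out_channels, -1)      # (out_channels, in*kh*kw)
    singular_values = torch.linalg.svdvals(mat)
    eigenvalues = singular_values ** 2               # spectrum of mat @ mat.T

    # Step 2: normalize the eigenvalues into a probability distribution.
    probs = eigenvalues / eigenvalues.sum().clamp_min(1e-12)

    # Step 3: information entropy of the spectrum, averaged over the filters.
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum()
    return (entropy / out_channels).item()

# Usage: a layer whose spectrum is spread evenly scores higher than one whose
# energy is concentrated in a few directions.
layer = torch.nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3)
print(f"AFIE ≈ {afie_score(layer.weight.detach()):.4f}")
```

Under these assumptions the score depends only on the eigen-spectrum of the layer's weights, which would be consistent with the abstract's observation that the importance evaluation is already stable after a single training epoch.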