CR-SFP: Learning Consistent Representation for Soft Filter Pruning
Main authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | Soft filter pruning (SFP) has emerged as an effective pruning
technique that allows pruned filters to keep updating and gives them the
opportunity to regrow into the network. However, this strategy applies
training and pruning in an alternating manner, which inevitably causes
inconsistent representations between the reconstructed network (R-NN) at
training time and the pruned network (P-NN) at inference time, resulting in
performance degradation. In this paper, we propose to mitigate this gap by
learning a consistent representation for soft filter pruning, dubbed CR-SFP.
Specifically, at each training step, CR-SFP optimizes the R-NN and the P-NN
simultaneously on different distorted versions of the same training data,
while forcing them to be consistent by minimizing the divergence between
their posterior distributions via a bidirectional KL-divergence loss.
Meanwhile, the R-NN and the P-NN share backbone parameters, so only
additional classifier parameters are introduced. After training, the P-NN
can be exported for inference. CR-SFP is a simple yet effective training
framework that improves the accuracy of the P-NN without introducing any
additional inference cost, and it can be combined with a variety of pruning
criteria and loss functions. Extensive experiments demonstrate that CR-SFP
achieves consistent improvements across various CNN architectures. Notably,
on ImageNet, CR-SFP removes more than 41.8% of the FLOPs of ResNet18 while
reaching 69.2% top-1 accuracy, improving on SFP by 2.1% under the same
training settings. The code will be publicly available on GitHub. |
DOI: | 10.48550/arxiv.2312.11555 |
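
For readers who want a concrete picture of the training scheme the abstract describes, below is a minimal PyTorch sketch of one CR-SFP training step. It is an illustration under assumptions, not the authors' released code: the module and helper names (`backbone`, `fc_r`, `fc_p`, `soft_prune_`), the L2-norm pruning criterion, and the loss weight `alpha` are hypothetical, and the weight sharing between R-NN and P-NN is simplified to a shared backbone with two classifier heads.

```python
# Minimal sketch of one CR-SFP training step, following the abstract:
# R-NN and P-NN share backbone parameters, see two distorted views of
# the same batch, and are aligned with a bidirectional KL loss.
# All names and the L2-norm criterion here are illustrative assumptions.
import torch
import torch.nn.functional as F

def soft_prune_(conv: torch.nn.Conv2d, prune_ratio: float = 0.3) -> None:
    """SFP-style soft pruning: zero the filters with the smallest L2 norm.
    Zeroed filters still receive gradient updates, so they may regrow."""
    with torch.no_grad():
        norms = conv.weight.flatten(1).norm(p=2, dim=1)  # one norm per filter
        n_prune = int(prune_ratio * norms.numel())
        pruned = norms.argsort()[:n_prune]
        conv.weight[pruned] = 0.0

def bidirectional_kl(logits_r: torch.Tensor, logits_p: torch.Tensor) -> torch.Tensor:
    """Symmetric KL divergence between the posterior distributions of the
    reconstructed network (R-NN) and the pruned network (P-NN)."""
    log_r = F.log_softmax(logits_r, dim=1)
    log_p = F.log_softmax(logits_p, dim=1)
    kl_rp = F.kl_div(log_p, log_r, reduction="batchmean", log_target=True)
    kl_pr = F.kl_div(log_r, log_p, reduction="batchmean", log_target=True)
    return 0.5 * (kl_rp + kl_pr)

def cr_sfp_step(backbone, fc_r, fc_p, view1, view2, targets, optimizer, alpha=1.0):
    """One training step: both networks share the backbone and differ only
    in their classifier heads; each sees a different augmented view."""
    logits_r = fc_r(backbone(view1))  # R-NN path
    logits_p = fc_p(backbone(view2))  # P-NN path
    loss = (F.cross_entropy(logits_r, targets)
            + F.cross_entropy(logits_p, targets)
            + alpha * bidirectional_kl(logits_r, logits_p))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss
```

In an SFP-style schedule, `soft_prune_` would be applied to each convolutional layer at the end of every epoch; because the zeroed filters remain trainable, they can regrow before the final pruned network is exported for inference.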