Dynamic Shuffle: An Efficient Channel Mixture Method
Saved in:
Main authors: , , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Summary: The redundancy of convolutional neural networks depends not only on the
weights but also on the inputs. Shuffling is an efficient operation for mixing
channel information, but the shuffle order is usually pre-defined. To reduce this
data-dependent redundancy, we devise a dynamic shuffle module that generates
data-dependent permutation matrices for shuffling. Since the dimension of a
permutation matrix is proportional to the square of the number of input
channels, we make the generation process efficient by dividing the channels
into groups, generating two shared small permutation matrices for each group,
and using the Kronecker product together with a cross-group shuffle to obtain
the final permutation matrices. To make the generation process learnable,
softmax, orthogonal regularization, and binarization are employed, based on
theoretical analysis, to asymptotically approximate a permutation matrix.
Dynamic shuffle adaptively mixes channel information with negligible extra
computation and memory occupancy. Experimental results on the image
classification benchmark datasets CIFAR-10, CIFAR-100, Tiny ImageNet, and
ImageNet show that our method significantly improves the performance of
ShuffleNets. By adding the dynamically generated matrix to a learnable static
matrix, we further propose static-dynamic shuffle and show that it can serve as
a lightweight replacement for ordinary pointwise convolution.
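The two key constructions named in the summary can be illustrated with a short sketch: the Kronecker product of two small permutation matrices is itself a permutation matrix, which is why per-group channel shuffles can be built cheaply, and a row-wise softmax with an orthogonality penalty gives a differentiable relaxation of a permutation. This is a minimal NumPy illustration, not the authors' implementation; the matrix sizes (4 and 8) and the exact form of the regularizer are assumptions.

```python
import numpy as np

def random_permutation_matrix(n, rng):
    """Return a random n x n permutation matrix (one 1 per row and column)."""
    P = np.zeros((n, n))
    P[np.arange(n), rng.permutation(n)] = 1.0
    return P

rng = np.random.default_rng(0)

# Two small permutation matrices shared within a channel group.
# Sizes are illustrative; the abstract only says they are small.
P1 = random_permutation_matrix(4, rng)
P2 = random_permutation_matrix(8, rng)

# The Kronecker product of two permutation matrices is a permutation
# matrix of size 32 x 32 (= 4 * 8), so a full-size channel permutation
# is parameterized by far fewer entries than a direct 32 x 32 matrix.
P = np.kron(P1, P2)
assert (P.sum(axis=0) == 1).all() and (P.sum(axis=1) == 1).all()

# Shuffling a vector of 32 channel activations.
x = np.arange(32.0)
shuffled = P @ x  # a reordering of x, no values lost or mixed

# Differentiable relaxation: a row-wise softmax over learned logits is
# row-stochastic; pushing S S^T toward the identity (an orthogonal
# regularizer) drives S toward a hard permutation, which can then be
# binarized at inference time.
logits = rng.normal(size=(4, 4))
S = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
ortho_penalty = np.linalg.norm(S @ S.T - np.eye(4)) ** 2
```

The assertion in the middle verifies the property the construction relies on: every row and column of the Kronecker product sums to one with 0/1 entries, i.e. it is a valid permutation.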
DOI: 10.48550/arxiv.2310.02776