MaxQ: Multi-Axis Query for N:M Sparsity Network
Format: Article
Language: English
Abstract: N:M sparsity has received increasing attention due to its remarkable performance and latency trade-off compared with structured and unstructured sparsity. However, existing N:M sparsity methods do not differentiate the relative importance of weights among blocks and leave important weights underappreciated. Besides, they apply N:M sparsity to the whole network directly, which causes severe information loss. Thus, they are still sub-optimal. In this paper, we propose an efficient and effective Multi-Axis Query methodology, dubbed MaxQ, to rectify these problems. During training, MaxQ employs a dynamic approach to generate soft N:M masks, considering weight importance across multiple axes. This method enhances the weights with more importance and ensures more effective updates. Meanwhile, a sparsity strategy that gradually increases the percentage of N:M weight blocks is applied, which allows the network to heal progressively from the pruning-induced damage. At runtime, the N:M soft masks can be precomputed as constants and folded into the weights without distorting the sparse pattern or incurring additional computational overhead. Comprehensive experiments demonstrate that MaxQ achieves consistent improvements across diverse CNN architectures on various computer vision tasks, including image classification, object detection and instance segmentation. For ResNet50 with a 1:16 sparse pattern, MaxQ achieves 74.6% top-1 accuracy on ImageNet, improving over the state-of-the-art by more than 2.8%. Code and checkpoints are available at https://github.com/JingyangXiang/MaxQ.
DOI: 10.48550/arxiv.2312.07061
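For illustration only, the sketch below shows one way a soft N:M mask with a gradually increased block percentage could be generated for a 2-D weight matrix. The function name `soft_nm_mask`, the magnitude-based scores, the sigmoid gate and the `block_ratio`/`temperature` parameters are assumptions made for this sketch; they are not taken from the MaxQ paper or repository, which define the actual multi-axis query and schedule.

```python
# Minimal, hypothetical sketch of soft N:M masking (NOT the authors' MaxQ code).
# The magnitude-based scores, the sigmoid gate and the `block_ratio` schedule
# below are simplifying assumptions made only for illustration.
import torch


def soft_nm_mask(weight: torch.Tensor, n: int = 2, m: int = 4,
                 block_ratio: float = 1.0, temperature: float = 10.0) -> torch.Tensor:
    """Return a soft mask with the same shape as `weight` (out_features x in_features).

    Within each group of `m` consecutive input weights, the `n` largest-magnitude
    entries get a gate close to 1 and the others close to 0. Only a `block_ratio`
    fraction of blocks is sparsified, imitating a schedule that gradually increases
    the percentage of N:M weight blocks.
    """
    assert 0 < n < m and weight.shape[1] % m == 0
    out_ch, in_ch = weight.shape
    w = weight.abs().reshape(out_ch, in_ch // m, m)                 # (O, blocks, M)

    # Soft top-N gate: threshold halfway between the largest pruned and the
    # smallest kept magnitude, pushed through a sigmoid for differentiability.
    sorted_w, _ = w.sort(dim=-1)                                    # ascending
    thresh = 0.5 * (sorted_w[..., m - n - 1] + sorted_w[..., m - n])
    soft_keep = torch.sigmoid(temperature * (w - thresh.unsqueeze(-1)))

    # Assumption: blocks with the lowest total magnitude are sparsified first;
    # the remaining blocks stay dense until the schedule reaches them.
    block_score = w.sum(dim=-1)                                     # (O, blocks)
    k = int(round(block_ratio * block_score.numel()))
    if k == 0:
        return torch.ones_like(weight)
    cutoff = block_score.flatten().kthvalue(k).values
    sparsify = (block_score <= cutoff).float().unsqueeze(-1)        # 1 -> apply N:M gate

    mask = sparsify * soft_keep + (1.0 - sparsify)                  # dense blocks keep mask 1
    return mask.reshape(out_ch, in_ch)


if __name__ == "__main__":
    w = torch.randn(64, 128)
    mask = soft_nm_mask(w, n=2, m=4, block_ratio=0.5)
    w_folded = w * mask    # at run time the precomputed mask can be folded into the weights
    print(mask.shape, (mask > 0.5).float().mean().item())           # ~0.75 of weights kept
```

Because the gate is a fixed function of the weights once training ends, multiplying it into the weight tensor (as in `w_folded` above) is all that is needed at inference time, which is the folding behaviour the abstract describes.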