Distillation embedded absorbable pruning for fast object re-identification
Published in: Pattern Recognition, 2024-08, Vol. 152, p. 110437, Article 110437
Main authors: , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Combining knowledge distillation (KD) and network pruning (NP) shows promise for learning a light network to accelerate object re-identification. However, KD requires an untrained student network so that the more critical connections can be established in early epochs, whereas NP demands a well-trained student network to avoid destroying critical connections. This presents a dilemma that can lead to a collapse of the student network and harm object Re-ID performance. To resolve it, we propose a distillation embedded absorbable pruning (DEAP) method. We design a pruner-convolution-pruner (PCP) unit that resolves the dilemma by loading NP's sparse regularization onto extra untrained pruners. Additionally, we propose an asymmetric relation knowledge distillation method that readily transfers feature representation knowledge and asymmetric pairwise similarity knowledge without additional adaptation modules. Finally, we apply re-parameterization to absorb the pruners of the PCP units and simplify the student network. Experiments demonstrate the superiority of DEAP; for example, on the VeRi-776 dataset with ResNet-101 as the teacher, DEAP saves 73.24% of model parameters and 71.98% of floating-point operations without sacrificing accuracy.

Highlights:
• We design a distillation embedded absorbable pruning method to readily transfer knowledge from a teacher network to a student network.
• We propose an asymmetric relation knowledge distillation method to transfer both feature representation knowledge and pairwise similarity knowledge.
• Our method achieves state-of-the-art performance in terms of both accuracy and speed.
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2024.110437
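
Note: The abstract describes a pruner-convolution-pruner (PCP) unit whose sparse regularization is carried by extra pruner layers that are later absorbed by re-parameterization. The paper's actual implementation is not reproduced in this record; the following is a minimal PyTorch sketch under the assumption that each pruner is a learnable per-channel scaling (a diagonal 1x1 convolution) penalized with an L1 term and then folded into the central convolution's weights. The names `PCPUnit`, `sparsity_loss`, and `absorb` are illustrative, not from the paper.

```python
# Hypothetical sketch of a pruner-convolution-pruner (PCP) unit and its
# absorption by re-parameterization. Design details (per-channel scaling
# pruners, L1 sparsity on the pruners only, folding into the conv weight)
# are assumptions for illustration, not the authors' code.
import torch
import torch.nn as nn


class PCPUnit(nn.Module):
    """A convolution wrapped by two learnable per-channel 'pruner' scalings."""

    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        self.pre_pruner = nn.Parameter(torch.ones(in_ch))    # scales input channels
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding, bias=False)
        self.post_pruner = nn.Parameter(torch.ones(out_ch))  # scales output channels

    def forward(self, x):
        x = x * self.pre_pruner.view(1, -1, 1, 1)
        x = self.conv(x)
        return x * self.post_pruner.view(1, -1, 1, 1)

    def sparsity_loss(self):
        # Sparse (L1) regularization is placed on the extra pruners only, so the
        # student's own convolution weights are not directly penalized.
        return self.pre_pruner.abs().sum() + self.post_pruner.abs().sum()

    @torch.no_grad()
    def absorb(self):
        """Fold both pruners into the convolution weight (re-parameterization)."""
        w = self.conv.weight                         # (out_ch, in_ch, k, k)
        w = w * self.pre_pruner.view(1, -1, 1, 1)    # absorb input scaling
        w = w * self.post_pruner.view(-1, 1, 1, 1)   # absorb output scaling
        fused = nn.Conv2d(w.shape[1], w.shape[0], w.shape[2],
                          padding=self.conv.padding, bias=False)
        fused.weight.copy_(w)
        return fused


# Usage: train with the sparsity term added to the task/distillation losses,
# then absorb the pruners so the deployed student is a plain convolution.
unit = PCPUnit(64, 128)
x = torch.randn(2, 64, 56, 56)
objective = unit(x).mean() + 1e-4 * unit.sparsity_loss()  # stand-in for real losses

# Simulate trained pruner values (some channels driven toward zero by the L1 term).
with torch.no_grad():
    unit.pre_pruner.copy_(torch.rand(64))
    unit.post_pruner.copy_(torch.rand(128))

fused = unit.absorb()
assert torch.allclose(unit(x), fused(x), atol=1e-4)  # fused conv matches the PCP unit
```

In this sketch, channels whose pruner scales are driven to zero by the L1 term correspond to near-zero rows or columns of the fused weight, so they can be removed after absorption; this is one plausible reading of how the pruning becomes "absorbable" without leaving extra layers in the student network.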