Gradient-Guided Knowledge Distillation for Object Detectors
Format: Article
Language: English
Online Access: Order full text
Summary: Deep learning models have demonstrated remarkable success in object detection, yet their complexity and computational intensity pose a barrier to deploying them in real-world applications (e.g., self-driving perception). Knowledge Distillation (KD) is an effective way to derive efficient models. However, only a small number of KD methods tackle object detection. Also, most of them focus on mimicking the plain features of the teacher model but rarely consider how the features contribute to the final detection. In this paper, we propose a novel approach for knowledge distillation in object detection, named Gradient-guided Knowledge Distillation (GKD). Our GKD uses gradient information to identify and assign more weights to features that significantly impact the detection loss, allowing the student to learn the most relevant features from the teacher. Furthermore, we present bounding-box-aware multi-grained feature imitation (BMFI) to further improve the KD performance. Experiments on the KITTI and COCO-Traffic datasets demonstrate our method's efficacy in knowledge distillation for object detection. On one-stage and two-stage detectors, our GKD-BMFI leads to an average of 5.1% and 3.8% mAP improvement, respectively, beating various state-of-the-art KD methods.
DOI: 10.48550/arxiv.2303.04240
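
The abstract outlines how GKD works: gradients of the detection loss with respect to the teacher's features serve as importance weights for feature imitation. Below is a minimal, hypothetical PyTorch sketch of that idea, based only on the abstract; the function name, the absolute-gradient weighting, and the squared-error imitation term are assumptions, not the authors' implementation. BMFI is not sketched, since the abstract gives no detail beyond its name.

```python
# Minimal, assumption-based sketch of gradient-guided feature imitation
# as described in the abstract; not the paper's official code.
import torch


def gkd_loss(teacher_feat: torch.Tensor,
             student_feat: torch.Tensor,
             teacher_det_loss: torch.Tensor) -> torch.Tensor:
    """Weight feature imitation by each teacher feature's influence on
    the detection loss.

    teacher_feat: teacher feature map that is part of the graph which
        produced teacher_det_loss (e.g. captured with a forward hook).
    student_feat: student feature map of the same shape.
    teacher_det_loss: scalar detection loss from the teacher's head.
    """
    # Gradient magnitude of the detection loss w.r.t. the teacher
    # features marks the features that matter most for detection.
    grads = torch.autograd.grad(teacher_det_loss, teacher_feat,
                                retain_graph=True)[0]
    weights = grads.abs()
    weights = weights / (weights.sum() + 1e-8)  # normalize to sum to 1
    # The student imitates the teacher more strongly where weights are high.
    return (weights * (student_feat - teacher_feat.detach()) ** 2).sum()


# Hypothetical usage with dummy tensors of shape (N, C, H, W); in a real
# detector, the features would come from hooks and the loss from its head.
t_feat = torch.randn(2, 256, 32, 32, requires_grad=True)
s_feat = torch.randn(2, 256, 32, 32, requires_grad=True)
det_loss = (t_feat ** 2).mean()  # stand-in for a real detection loss
kd = gkd_loss(t_feat, s_feat, det_loss)
```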