Leveraging Different Learning Styles for Improved Knowledge Distillation in Biomedical Imaging
Saved in:
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Learning style refers to the type of training mechanism an individual adopts to gain new knowledge. As suggested by the VARK model, humans have different learning preferences, such as Visual (V), Auditory (A), Read/Write (R), and Kinesthetic (K), for acquiring and effectively processing information. Our work leverages this concept of knowledge diversification to improve the performance of model compression techniques such as Knowledge Distillation (KD) and Mutual Learning (ML). Consequently, we use a single-teacher, two-student network in a unified framework that not only allows knowledge transfer from teacher to students (KD) but also encourages collaborative learning between the students (ML). Unlike the conventional approach, in which the teacher shares the same knowledge, in the form of predictions or feature representations, with every student network, our proposed approach employs a more diversified strategy: it trains one student with predictions and the other with feature maps from the teacher. We further extend this knowledge diversification by facilitating the exchange of predictions and feature maps between the two student networks, enriching their learning experience. We have conducted comprehensive experiments on three benchmark datasets for both classification and segmentation tasks, using two different network architecture combinations. The experimental results demonstrate that knowledge diversification in a combined KD and ML framework outperforms conventional KD or ML techniques (with a similar network configuration) that use only predictions, with an average improvement of 2%. Furthermore, consistent improvement in performance across different tasks, with various network architectures, and over state-of-the-art techniques establishes the robustness and generalizability of the proposed model.
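The abstract describes a combined objective: each student receives a different form of knowledge from the teacher (predictions vs. feature maps) and exchanges the complementary form with its peer. Below is a minimal PyTorch sketch of how such a diversified KD + ML loss could be assembled. All names (kd_prediction_loss, diversified_step, the alpha/beta weights, and the assumption that each model returns a (logits, features) pair with matching feature shapes) are illustrative assumptions, not the authors' released code; the paper may use different distance functions and adaptation layers.

```python
import torch
import torch.nn.functional as F


def kd_prediction_loss(student_logits, teacher_logits, T=4.0):
    """Prediction-based transfer: KL divergence between
    temperature-scaled class distributions (Hinton-style KD)."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * T * T


def feature_map_loss(student_feat, teacher_feat):
    """Feature-based transfer: match intermediate representations.
    A plain MSE is used here for simplicity; shapes are assumed to
    already agree (in practice a projection layer would align them)."""
    return F.mse_loss(student_feat, teacher_feat)


def diversified_step(x, y, teacher, student_a, student_b,
                     alpha=0.5, beta=0.5):
    """One training step of the diversified KD + ML objective
    (hypothetical formulation based on the abstract)."""
    with torch.no_grad():
        t_logits, t_feat = teacher(x)  # frozen teacher

    a_logits, a_feat = student_a(x)
    b_logits, b_feat = student_b(x)

    # Supervised task loss for both students.
    task = F.cross_entropy(a_logits, y) + F.cross_entropy(b_logits, y)

    # Diversified teacher-to-student transfer: student A learns from the
    # teacher's predictions, student B from its feature maps.
    kd = kd_prediction_loss(a_logits, t_logits) \
        + feature_map_loss(b_feat, t_feat)

    # Mutual learning between students with the complementary signal:
    # each student passes its peer the form of knowledge the peer did
    # not receive from the teacher (detached so it acts as a target).
    ml = kd_prediction_loss(b_logits, a_logits.detach()) \
        + feature_map_loss(a_feat, b_feat.detach())

    return task + alpha * kd + beta * ml
```

Treating the peer's output as a detached target keeps the sketch to a single combined loss; a closer reproduction of classic mutual learning would instead keep separate optimizers per student and let each loss term update only one network.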
DOI: 10.48550/arxiv.2212.02931