Contrastive Knowledge Amalgamation for Unsupervised Image Classification
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Knowledge amalgamation (KA) aims to learn a compact student model to handle
the joint objective from multiple teacher models that are specialized for
their own tasks respectively. Current methods focus on coarsely aligning
teachers and students in the common representation space, making it difficult
for the student to learn the proper decision boundaries from a set of
heterogeneous teachers. Besides, the KL divergence in previous works only
minimizes the probability distribution difference between teachers and the
student, ignoring the intrinsic characteristics of teachers. Therefore, we
propose a novel Contrastive Knowledge Amalgamation (CKA) framework, which
introduces contrastive losses and an alignment loss to achieve intra-class
cohesion and inter-class separation. Intra- and inter-model contrastive losses
are designed to widen the distance between representations of different
classes. The alignment loss is introduced to minimize the sample-level
distribution differences of teacher-student models in the common representation
space. Furthermore, the student learns heterogeneous unsupervised classification
tasks through soft targets efficiently and flexibly in the task-level
amalgamation. Extensive experiments on benchmarks demonstrate the
generalization capability of CKA in the amalgamation of a specific task as well
as multiple tasks. Comprehensive ablation studies provide further insight
into CKA. |
---|---|
DOI: | 10.48550/arxiv.2307.14781 |
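
The summary refers to soft targets and to the KL divergence between teacher and student probability distributions. The following is a minimal sketch of that generic soft-target term only, assuming temperature-scaled softmax outputs; it is not the CKA implementation, and the function names and the `temperature` parameter are hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, softened by a temperature (assumed parameter)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def soft_target_loss(teacher_logits, student_logits, temperature=2.0):
    """Hypothetical helper: KL divergence between softened teacher and student outputs."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return kl_divergence(teacher_probs, student_probs)


if __name__ == "__main__":
    # The loss is zero when the student matches the teacher and positive otherwise.
    print(soft_target_loss([2.0, 0.0, -1.0], [2.0, 0.0, -1.0]))
    print(soft_target_loss([2.0, 0.0, -1.0], [0.0, 2.0, -1.0]))
```

A term like this compares only output distributions, which is the limitation the summary attributes to earlier work; the contrastive and alignment losses it proposes act on representations and are not shown here.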