Deep Ensemble Collaborative Learning by using Knowledge-transfer Graph for Fine-grained Object Classification
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Mutual learning, in which multiple networks learn by sharing their knowledge,
improves the performance of each network. However, the performance of ensembles
of networks that have undergone mutual learning is not significantly better
than that of normal ensembles without mutual learning, even though the
performance of each network has improved significantly. This may be due to the
relationship between the knowledge in mutual learning and the individuality of
the networks in the ensemble. In this study, we propose an ensemble method
using knowledge transfer to improve the accuracy of ensembles by introducing a
loss design that promotes diversity among networks in mutual learning. We use
an attention map as knowledge, which represents the probability distribution
and information in the middle layer of a network. There are many ways to
combine networks and loss designs for knowledge transfer methods. Therefore, we
use the automatic optimization of knowledge-transfer graphs to consider a
variety of knowledge-transfer methods by graphically representing conventional
mutual-learning and distillation methods and optimizing each element through
hyperparameter search. The proposed method consists of a mechanism for
constructing an ensemble in a knowledge-transfer graph, attention loss, and a
loss design that promotes diversity among networks. We explore optimal ensemble
learning by optimizing a knowledge-transfer graph to maximize ensemble
accuracy. From exploration of graphs and evaluation experiments using the
datasets of Stanford Dogs, Stanford Cars, and CUB-200-2011, we confirm that the
proposed method is more accurate than a conventional ensemble method. |
---|---|
DOI: | 10.48550/arxiv.2103.14845 |