Guided Zoom: Zooming into Network Evidence to Refine Fine-Grained Model Decisions

Detailed description

Bibliographic details
Published in:	IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021-11, Vol. 43 (11), p. 4196-4202
Main authors:	Bargal, Sarah Adel; Zunino, Andrea; Petsiuk, Vitali; Zhang, Jianming; Saenko, Kate; Murino, Vittorio; Sclaroff, Stan
Format:	Article
Language:	English
Subjects:	Accuracy; Annotations; Artificial neural networks; Classification; classification refinement; Conditional probability; convolutional neural networks; Correlation; Datasets; Decisions; Explainable AI; fine-grained image classification; Grounding; Location awareness; Model accuracy; Predictions; Predictive models; saliency; Testing time; Training; Visualization; Zooming

Description
Abstract:	In state-of-the-art deep single-label classification models, the top-k (k = 2, 3, 4, …) accuracy is usually significantly higher than the top-1 accuracy. This is more evident in fine-grained datasets, where differences between classes are quite subtle. Exploiting the information provided in the top-k predicted classes boosts the final prediction of a model. We propose Guided Zoom, a novel way in which explainability could be used to improve model performance. We do so by making sure the model has "the right reasons" for a prediction. The reason/evidence upon which a deep neural network makes a prediction is defined to be the grounding, in the pixel space, for a specific class conditional probability in the model output. Guided Zoom examines how reasonable the evidence used to make each of the top-k predictions is. Test time evidence is deemed reasonable if it is coherent with evidence used to make similar correct decisions at training time. This leads to better informed predictions. We explore a variety of grounding techniques and study their complementarity for computing evidence. We show that Guided Zoom results in an improvement of a model's classification accuracy and achieves state-of-the-art classification performance on four fine-grained classification datasets. Our code is available at https://github.com/andreazuna89/Guided-Zoom.
ISSN:	0162-8828, 1939-3539, 2160-9292
DOI:	10.1109/TPAMI.2021.3054303
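
Illustrative sketch (not part of the catalog record and not the authors' implementation): the abstract observes that top-k accuracy usually exceeds top-1 accuracy and that re-examining the top-k candidates can improve the final prediction. The Python sketch below shows how top-k accuracy is computed and one hypothetical way to re-rank the top-k candidates. The function names, the `evidence_scores` input, and the `weight` parameter are assumptions made for this example; how Guided Zoom actually scores evidence is described in the paper and the linked repository.

```python
from typing import List, Sequence


def top_k_classes(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring classes, best first."""
    order = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return order[:k]


def top_k_accuracy(all_scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the top-k predictions."""
    hits = sum(1 for scores, label in zip(all_scores, labels)
               if label in top_k_classes(scores, k))
    return hits / len(labels)


def rerank_top_k(scores: Sequence[float],
                 evidence_scores: Sequence[float],
                 k: int, weight: float = 0.5) -> int:
    """Hypothetical refinement step (an assumption, not the paper's formula).

    Only the top-k classes are considered. Each candidate is scored by a
    weighted mix of the classifier score and an evidence-consistency score,
    and the best candidate is returned.
    """
    candidates = top_k_classes(scores, k)
    return max(candidates,
               key=lambda c: (1 - weight) * scores[c] + weight * evidence_scores[c])


# Toy data: three samples, three classes (made up for illustration).
toy_scores = [[0.5, 0.4, 0.1], [0.45, 0.35, 0.2], [0.2, 0.3, 0.5]]
toy_labels = [1, 0, 2]

print(top_k_accuracy(toy_scores, toy_labels, k=1))
print(top_k_accuracy(toy_scores, toy_labels, k=2))

# For the first sample the classifier prefers class 0, but the made-up
# evidence score favours class 1, which is also among the top-2 candidates.
print(rerank_top_k(toy_scores[0], [0.2, 0.9, 0.95], k=2))
```

In the toy data the first sample is misclassified at top-1 but its true class is among the top-2 candidates, which is the gap a refinement step can exploit. Class 2 is never chosen for that sample, even though its made-up evidence score is the highest, because it lies outside the top-2 candidates.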