LAP: An Attention-Based Module for Concept Based Self-Interpretation and Knowledge Injection in Convolutional Neural Networks

Bibliographic Details
Published in: arXiv.org, 2023-10
Main Authors: Ghavami Modegh, Rassa; Salimi, Ahmad; Dizaji, Alireza; Rabiee, Hamid R.
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Despite the state-of-the-art performance of deep convolutional neural networks, they are susceptible to bias and can malfunction in unseen situations. Moreover, the complex computation behind their reasoning is not human-understandable, which makes it difficult to develop trust. External explainer methods attempt to interpret network decisions in a human-understandable way, but they are criticized for fallacies arising from their assumptions and simplifications. Inherently self-interpretable models, on the other hand, are more robust to such fallacies, but their techniques cannot be applied to already trained models. In this work, we propose a new attention-based pooling layer, called Local Attention Pooling (LAP), that provides self-interpretability and enables knowledge injection without loss of performance. The module can easily be plugged into any convolutional neural network, even already trained ones. We define a weakly supervised training scheme that learns the distinguishing features used in decision-making without relying on expert annotations. We verify our claims by evaluating several LAP-extended models on two datasets, including ImageNet. The proposed framework offers more valid, human-understandable, and faithful-to-the-model interpretations than commonly used white-box explainer methods.
ISSN:2331-8422
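
To make the idea of an attention-based pooling layer concrete, below is a minimal PyTorch sketch of attention-weighted spatial pooling of the kind the abstract describes. It is an illustrative approximation only: the class name, the 1x1-convolution scoring head, the softmax normalization within each window, and the 2x2 window size are assumptions, not the authors' exact LAP design.

```python
# Minimal sketch of attention-weighted pooling (assumed design, not the exact LAP module).
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalAttentionPool2d(nn.Module):
    """Downsamples a feature map by weighting each position inside a pooling
    window with a learned attention score (illustrative approximation)."""

    def __init__(self, channels: int, kernel_size: int = 2, stride: int = 2):
        super().__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        # A 1x1 convolution produces one attention logit per spatial position.
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        logits = self.score(x)                                      # (B, 1, H, W)
        # Unfold features and logits into pooling windows.
        feat = F.unfold(x, self.kernel_size, stride=self.stride)    # (B, C*k*k, L)
        att = F.unfold(logits, self.kernel_size, stride=self.stride)  # (B, k*k, L)
        att = att.softmax(dim=1)                                    # normalize within each window
        feat = feat.reshape(b, c, self.kernel_size ** 2, -1)
        pooled = (feat * att.unsqueeze(1)).sum(dim=2)               # weighted sum per window
        out_h = (h - self.kernel_size) // self.stride + 1
        out_w = (w - self.kernel_size) // self.stride + 1
        return pooled.reshape(b, c, out_h, out_w)


# Example: drop-in replacement for a 2x2 pooling stage in a CNN backbone.
pool = LocalAttentionPool2d(channels=64)
features = torch.randn(1, 64, 32, 32)
print(pool(features).shape)  # torch.Size([1, 64, 16, 16])
```

In such a design, the per-window attention scores double as a saliency signal, which is consistent with the abstract's claim that the pooling layer itself yields human-understandable interpretations and a hook for injecting prior knowledge.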