LAP: An Attention-Based Module for Concept Based Self-Interpretation and Knowledge Injection in Convolutional Neural Networks

Bibliographic Details
Published in: arXiv.org, 2023-10
Main Authors: Ghavami Modegh, Rassa; Salimi, Ahmad; Dizaji, Alireza; Rabiee, Hamid R.
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Despite the state-of-the-art performance of deep convolutional neural networks, they are susceptible to bias and can malfunction in unseen situations. Moreover, the complex computation behind their reasoning is not human-understandable, which makes it difficult to develop trust. External explainer methods attempt to interpret network decisions in a human-understandable way, but they are criticized for fallacies arising from their assumptions and simplifications. Inherently self-interpretable models, on the other hand, are more robust to such fallacies, but their techniques cannot be applied to already trained models. In this work, we propose a new attention-based pooling layer, called Local Attention Pooling (LAP), that provides self-interpretability and enables knowledge injection without loss of performance. The module can easily be plugged into any convolutional neural network, even already trained ones. We define a weakly supervised training scheme that learns the distinguishing features used in decision-making without relying on expert annotations. We verify our claims by evaluating several LAP-extended models on two datasets, including ImageNet. The proposed framework offers more valid, human-understandable, and faithful-to-the-model interpretations than commonly used white-box explainer methods.
ISSN:2331-8422
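
To make the idea of an attention-based pooling layer concrete, below is a minimal PyTorch sketch of attention-weighted spatial pooling of the kind the abstract describes. It is an illustrative approximation only: the class name, the 1x1-convolution scoring head, the softmax normalization within each window, and the 2x2 window size are assumptions, not the authors' exact LAP design.

```python
# Minimal sketch of attention-weighted pooling (assumed design, not the exact LAP module).
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalAttentionPool2d(nn.Module):
    """Downsamples a feature map by weighting each position inside a pooling
    window with a learned attention score (illustrative approximation)."""

    def __init__(self, channels: int, kernel_size: int = 2, stride: int = 2):
        super().__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        # A 1x1 convolution produces one attention logit per spatial position.
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        logits = self.score(x)                                      # (B, 1, H, W)
        # Unfold features and logits into pooling windows.
        feat = F.unfold(x, self.kernel_size, stride=self.stride)    # (B, C*k*k, L)
        att = F.unfold(logits, self.kernel_size, stride=self.stride)  # (B, k*k, L)
        att = att.softmax(dim=1)                                    # normalize within each window
        feat = feat.reshape(b, c, self.kernel_size ** 2, -1)
        pooled = (feat * att.unsqueeze(1)).sum(dim=2)               # weighted sum per window
        out_h = (h - self.kernel_size) // self.stride + 1
        out_w = (w - self.kernel_size) // self.stride + 1
        return pooled.reshape(b, c, out_h, out_w)


# Example: drop-in replacement for a 2x2 pooling stage in a CNN backbone.
pool = LocalAttentionPool2d(channels=64)
features = torch.randn(1, 64, 32, 32)
print(pool(features).shape)  # torch.Size([1, 64, 16, 16])
```

In such a design, the per-window attention scores double as a saliency signal, which is consistent with the abstract's claim that the pooling layer itself yields human-understandable interpretations and a hook for injecting prior knowledge.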