Language Model Meets Prototypes: Towards Interpretable Text Classification Models through Prototypical Networks
Saved in:

| First Author: | |
|---|---|
| Format: | Article |
| Language: | eng |
| Subjects: | |
| Online Access: | Order full text |
| Summary: | Pretrained transformer-based Language Models (LMs) are well known for achieving significant improvements on NLP tasks, but their black-box nature, which leads to a lack of interpretability, has been a major concern. My dissertation focuses on developing intrinsically interpretable models that use LMs as encoders while maintaining their superior performance via prototypical networks. I initiated my research by investigating performance enhancements for interpretable sarcasm detection models. My proposed approach captures sentiment incongruity to enhance accuracy while offering instance-based explanations for classification decisions. Later, I developed a novel white-box multi-head graph attention-based prototype network designed to explain the decisions of text classification models without sacrificing the accuracy of the original black-box LMs. In addition, I am extending the attention-based prototype network with contrastive learning to redesign an interpretable graph neural network, aiming to enhance both the interpretability and performance of the model in document classification. |
| DOI: | 10.48550/arxiv.2412.03761 |
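To make the prototype idea concrete, below is a minimal sketch of a prototype layer on top of a pretrained LM encoder, in the spirit of the abstract's description: class logits are derived from an input's distances to learned prototype vectors, and the nearest prototypes provide instance-based explanations. This is an illustrative sketch assuming a PyTorch/Hugging Face setup; the class `PrototypeClassifier`, the distance-to-similarity mapping, and all hyperparameters are assumptions for exposition, not the dissertation's actual architecture.

```python
# Sketch of a prototype-based classification head over an LM encoder.
# All names and hyperparameters are illustrative assumptions, not the
# architecture proposed in the dissertation.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class PrototypeClassifier(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased",
                 num_prototypes_per_class=4, num_classes=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        n_proto = num_prototypes_per_class * num_classes
        # Learnable prototype vectors living in the encoder's embedding space.
        self.prototypes = nn.Parameter(torch.randn(n_proto, hidden))
        # Maps prototype similarities to class logits; each prototype is
        # intended to be associated with one class.
        self.classifier = nn.Linear(n_proto, num_classes, bias=False)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        z = out.last_hidden_state[:, 0]             # [CLS] embedding, (B, H)
        # Squared Euclidean distance to every prototype, shape (B, n_proto).
        dists = torch.cdist(z, self.prototypes).pow(2)
        # Distance-to-similarity transform (ProtoPNet-style): closer
        # prototypes yield higher activations.
        sims = torch.log((dists + 1.0) / (dists + 1e-4))
        return self.classifier(sims), dists         # logits + raw distances

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = PrototypeClassifier()
batch = tokenizer(["That was just great.", "What a lovely day!"],
                  return_tensors="pt", padding=True)
logits, dists = model(batch["input_ids"], batch["attention_mask"])
```

In prototype networks of this kind, the learned prototypes are typically projected onto real training examples, so each prediction can be explained by pointing to the training instances the input most resembles.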