Finding BERT errors by clustering activation vectors
Published in: Future Generation Computer Systems, 2025-05, Vol. 166, p. 107601, Article 107601
Main authors: , , ,
Format: Article
Language: eng
Online access: Full text
Summary:
• The non-linearity of deep neural networks makes them difficult to interpret and reduces the verifiability of the systems in which these models are applied.
• We describe a systematic approach to identifying the clusters with the most misclassifications (or wrong label annotations).
• We extract the activation vectors from a deep learning model, DNABERT, and visualize them using t-SNE.
• We use the resulting clusters to identify misclassifications of DNA sequences or problems with sequence tagging.
• We analyze the Euclidean distances between cluster means, looking for frequent discrepancies between predicted labels and actual labels.
The non-linear nature of deep neural networks makes it difficult to interpret the reasons behind their outputs, reducing the verifiability of the systems in which these models are applied. Understanding the patterns between activation vectors and predictions could give insight into erroneous classifications and how to identify them. This paper describes a systematic approach to identifying the clusters with the most misclassifications or false label annotations. For this research, we extracted the activation vectors from a deep learning model, DNABERT, and visualized them using t-SNE to understand the reasoning behind the results it produces. We applied K-means in a hierarchical fashion to the activation vectors of a set of training instances and analyzed the cluster mean activation vectors to find patterns in the errors across K-means clusters. The cluster analysis revealed that the predictions were uniform, or nearly 100 percent identical, within clusters of similar activation vectors. Two clusters whose objects mostly belong to the same true class tend to be closer together than clusters of opposite classes. The means of objects with the same true label are closer when two clusters share the same predicted label than when they have opposite predicted labels, showing that the activation vectors reflect both predicted and true classes. We performed a similar analysis for all 26 organisms in the dataset, showing that the Euclidean distance can be used to identify clusters with many errors. We propose a heuristic that uses the vector analysis between clusters to find clusters with a high number of misclassifications or incorrect label annotations. This can aid in identifying misclassifications of DNA sequences or problems with sequence tagging.
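The abstract gives no code, so the following is a minimal sketch of the first step only: pulling activation vectors from a DNABERT checkpoint and projecting them with t-SNE. The checkpoint name zhihan1996/DNA_bert_6, the use of the last layer's [CLS] token as the activation vector, and the to_kmers helper are my assumptions, not details taken from the paper.

```python
# Sketch: activation-vector extraction from DNABERT + t-SNE projection.
# Assumptions (not from the paper): checkpoint name, 6-mer tokenisation,
# and using the last hidden layer's [CLS] embedding as the activation vector.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.manifold import TSNE

MODEL_NAME = "zhihan1996/DNA_bert_6"  # hypothetical DNABERT checkpoint

def to_kmers(seq: str, k: int = 6) -> str:
    """Turn a raw DNA string into space-separated overlapping k-mers."""
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

def activation_vectors(sequences: list[str]) -> np.ndarray:
    """Return one activation vector per sequence ([CLS], last hidden layer)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
    model.eval()
    vectors = []
    with torch.no_grad():
        for seq in sequences:
            inputs = tokenizer(to_kmers(seq), return_tensors="pt", truncation=True)
            outputs = model(**inputs)
            # hidden_states[-1]: (1, seq_len, hidden_size); position 0 is [CLS]
            vectors.append(outputs.hidden_states[-1][0, 0].numpy())
    return np.stack(vectors)

def tsne_embedding(vectors: np.ndarray) -> np.ndarray:
    """2-D t-SNE projection of the activation vectors for visual inspection."""
    return TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(vectors)
```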
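Continuing the sketch, below is one plausible reading of the clustering and distance analysis: K-means applied at two levels (a "hierarchical fashion"), then a heuristic that flags a cluster when its mean activation vector lies closer to the mean of a cluster with the opposite dominant predicted label than to any same-label cluster. The functions hierarchical_kmeans and flag_suspicious_clusters, the two-level depth, the binary labels, and the flagging rule are my assumptions; the paper's actual heuristic may differ.

```python
# Sketch: two-level K-means over activation vectors and a cluster-mean
# Euclidean-distance heuristic for flagging error-prone clusters.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances

def hierarchical_kmeans(X: np.ndarray, k_top: int = 5, k_sub: int = 3,
                        seed: int = 0) -> np.ndarray:
    """Cluster X with K-means, then re-cluster inside each top-level cluster."""
    top = KMeans(n_clusters=k_top, random_state=seed, n_init=10).fit_predict(X)
    labels = np.empty(len(X), dtype=int)
    next_id = 0
    for c in range(k_top):
        idx = np.where(top == c)[0]
        k = min(k_sub, len(idx))
        sub = KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(X[idx])
        labels[idx] = sub + next_id
        next_id += k
    return labels

def flag_suspicious_clusters(X: np.ndarray, cluster_ids: np.ndarray,
                             predicted: np.ndarray) -> list:
    """Flag clusters whose mean is nearer to an opposite-prediction cluster mean
    than to any cluster mean sharing its dominant predicted label (binary labels)."""
    clusters = np.unique(cluster_ids)
    means = np.stack([X[cluster_ids == c].mean(axis=0) for c in clusters])
    dom = np.array([np.round(predicted[cluster_ids == c].mean()) for c in clusters])
    dist = pairwise_distances(means)          # Euclidean by default
    np.fill_diagonal(dist, np.inf)            # ignore self-distances
    flagged = []
    for i, c in enumerate(clusters):
        same = dist[i][dom == dom[i]]
        other = dist[i][dom != dom[i]]
        if other.size and other.min() < same.min():
            flagged.append(c)
    return flagged
```

A typical use would be cluster_ids = hierarchical_kmeans(activation_vectors(sequences)) followed by flag_suspicious_clusters(...) with the model's predicted labels; the flagged clusters are candidates for manual review of misclassified sequences or mislabeled annotations.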
ISSN: 0167-739X
DOI: 10.1016/j.future.2024.107601