A novel self-attention enriching mechanism for biomedical question answering
Saved in:
Published in: Expert systems with applications, 2023-09, Vol. 225, p. 120210, Article 120210
Main authors: ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract:
• Novel self-attention enriching mechanism for biomedical question answering.
• New state-of-the-art results for factoid biomedical question answering.
• Correlation between transformer attention scores and final prediction is validated.
The task of biomedical question answering is a subtask of the more general question answering task that is concerned only with biomedical questions. The current state-of-the-art models for this task, such as BioBERT and BioM-ELECTRA, are all based on the transformer architecture. The self-attention layer in the transformer plays a central role in the model's predictions. Recent studies on the inner workings of the transformer self-attention layer in question answering hypothesize that context-passage tokens with larger attention scores are more likely to be part of the predicted answer. Starting from this hypothesis, we experimented with a novel self-attention enriching mechanism for biomedical question answering, targeting factoid and list-type questions. In our approach, we enrich BioBERT's self-attention layer with biomedical and named-entity information previously extracted from the question and the context passage. The proposed enriching mechanism increases the attention scores for the biomedical and named entities, which are in most cases the answer to the question. This increase in attention scores influences the model's final prediction as hypothesized. Our proposed method achieves state-of-the-art results on several batches of the BioASQ 10b, 9b, 8b, and 7b datasets.
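The abstract does not specify how the entity information enters the self-attention computation; a minimal sketch of the general idea, assuming the enrichment takes the form of an additive bias on the attention logits for tokens tagged as biomedical or named entities (the function name, the `entity_mask` input, and the `bias` value are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def entity_enriched_attention(q, k, v, entity_mask, bias=2.0):
    """Scaled dot-product attention with an additive logit bias on entity tokens.

    q, k, v      : (seq, d) query, key, and value matrices for one head.
    entity_mask  : (seq,) array, 1.0 for tokens tagged as biomedical/named
                   entities, 0.0 otherwise (assumed to come from a prior
                   entity-extraction step over the question and passage).
    bias         : positive constant added to logits of entity-token columns,
                   raising their post-softmax attention scores.
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)            # standard scaled dot-product logits
    logits = logits + bias * entity_mask     # boost columns of entity tokens
    # numerically stable softmax over the key dimension
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

Because the bias is added before the softmax, entity tokens strictly gain attention mass relative to the unbiased case, which is the effect the abstract describes: larger attention scores on the likely-answer tokens that then propagate to the span prediction.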
ISSN: 0957-4174; 1873-6793
DOI: 10.1016/j.eswa.2023.120210