Multi-view pre-trained transformer via hierarchical capsule network for answer sentence selection
Saved in:
Published in: Applied Intelligence (Dordrecht, Netherlands), 2024-11, Vol. 54 (21), p. 10561-10580
Main authors: , , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Answer selection requires technology that effectively captures in-depth semantic information between the question and the corresponding answer. Most existing studies focus on using linear or pooling operations to directly classify the output representation, resulting in the loss of critical information and in single-label predictions. To address these issues, we propose a novel Multi-view Pre-trained Transformer with Hierarchical Capsule Network (MPT-HCN). Specifically, we propose a Hierarchical Capsule Network composed of three capsule networks that independently process the high-dimensional sparse information of words, the semantic information of similar expressions, and feature classification information, so that multiple attributes can be fully considered and accurately clustered. Moreover, we consider the impact of the intermediate encoder layers' outputs on the overall sequence semantic representation and propose a Multi-view Information Fusion that obtains the final semantic representation by weighted fusion of the outputs of all encoder layers, thereby avoiding reliance on a single prediction label. Extensive experiments on five typical representative datasets, especially the WikiQA dataset, show that our model MPT-HCN (RL) achieves an excellent performance of 0.939 MAP and 0.942 MRR, a significant improvement of 3.9% and 2.7%, respectively, over the state-of-the-art baseline model.
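The weighted fusion of all encoder-layer outputs described in the abstract can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the function name, the use of softmax-normalized learnable layer weights, and the NumPy formulation are all assumptions for clarity.

```python
import numpy as np

def multi_view_fusion(layer_outputs, layer_weights):
    """Illustrative sketch: fuse the outputs of all encoder layers
    into one sequence representation via softmax-normalized weights.

    layer_outputs: list of num_layers arrays, each (seq_len, hidden)
    layer_weights: array of shape (num_layers,), e.g. learnable logits
    returns: fused representation of shape (seq_len, hidden)
    """
    # Softmax over the per-layer logits (numerically stabilized).
    w = np.exp(layer_weights - np.max(layer_weights))
    w = w / w.sum()
    # Stack layers to (num_layers, seq_len, hidden) and take the
    # weighted sum over the layer axis.
    stacked = np.stack(layer_outputs, axis=0)
    return np.tensordot(w, stacked, axes=1)
```

With equal logits the fusion reduces to the plain mean of the layer outputs, which makes the weighting easy to sanity-check before training the logits jointly with the rest of the model.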
ISSN: 0924-669X, 1573-7497
DOI: 10.1007/s10489-024-05513-y