Multi-view pre-trained transformer via hierarchical capsule network for answer sentence selection



Bibliographic Details
Published in: Applied intelligence (Dordrecht, Netherlands), 2024-11, Vol. 54 (21), p. 10561-10580
Main authors: Li, Bing; Yang, Peng; Sun, Yuankang; Hu, Zhongjian; Yi, Meng
Format: Article
Language: English
Online access: Full text
Abstract: Answer selection requires technology that effectively captures in-depth semantic information between the question and the corresponding answer. Most existing studies focus on using linear or pooling operations to directly classify the output representation, resulting in the absence of critical information and the emergence of single-label predictions. To address these issues, we propose a novel Multi-view Pre-trained Transformer with Hierarchical Capsule Network (MPT-HCN). Specifically, we propose a Hierarchical Capsule Network composed of three capsule networks that independently process the high-dimensional sparse information of words, the semantic information of similar expressions, and feature classification information, so that multiple attributes can be fully considered and accurately clustered. Moreover, we consider the impact of the intermediate encoder layers' outputs on the overall sequence semantic representation and propose a Multi-view Information Fusion that obtains the final semantic representation by weighted fusion of the outputs of all encoder layers, thereby avoiding a single prediction label. Extensive experiments on five typical representative datasets, especially the WikiQA dataset, show that our model MPT-HCN (RL) achieves an excellent performance of 0.939 on MAP and 0.942 on MRR, a significant improvement of 3.9% and 2.7% respectively over the state-of-the-art baseline model.
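The multi-view fusion the abstract describes — a weighted combination of every encoder layer's output rather than the last layer alone — can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name, the tensor shapes, and the softmax normalization of the per-layer weights are illustrative choices, not the authors' actual implementation.

```python
import numpy as np

def multi_view_fusion(layer_outputs, layer_scores):
    """Fuse all encoder layer outputs into one sequence representation.

    layer_outputs: array of shape (num_layers, seq_len, hidden_size),
                   the hidden states produced by every encoder layer.
    layer_scores:  raw learnable scores of shape (num_layers,).
    """
    # Normalize the per-layer scores with a softmax so they sum to 1.
    w = np.exp(layer_scores - layer_scores.max())
    w = w / w.sum()
    # Weighted sum over the layer axis yields the fused representation
    # of shape (seq_len, hidden_size).
    return np.tensordot(w, layer_outputs, axes=1)

# Toy example: 12 encoder layers, 5 tokens, hidden size 8.
rng = np.random.default_rng(0)
layers = rng.normal(size=(12, 5, 8))
scores = np.zeros(12)  # equal scores -> softmax gives a plain mean
fused = multi_view_fusion(layers, scores)
```

With equal scores the fusion reduces to the mean over layers; during training the scores would be learned so that the more informative intermediate layers receive larger weights.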
ISSN: 0924-669X, 1573-7497
DOI: 10.1007/s10489-024-05513-y