A novel community answer matching approach based on phrase fusion heterogeneous information network
Published in: | Information processing & management 2021-01, Vol.58 (1), p.102408, Article 102408 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | • To the best of our knowledge, this is the first work to propose the phrase information network and to use it to construct a fusion heterogeneous information network (HIN) that represents complex entity relationships in community question answering (CQA). • We define the distance between entities of the same or different types in the HIN and propose a novel Type-constrained Top-k Similarity Entity Finding algorithm (TTSEF) for answer selection, which combines entity attributes and semantic features to select answers in CQA. • Extensive experiments demonstrate that the proposed algorithm outperforms state-of-the-art similar-entity matching methods in CQA. • A meta-path analysis of the optimal matching answers shows that phrases can serve as a bridge that effectively connects different types of entities in CQA.
Community Question Answering (CQA) allows users to ask and answer questions in a social way, so it is becoming the primary means by which people acquire knowledge. However, the asker must wait until a satisfactory answer appears, which reduces user activity. In this paper, we propose an innovative answering method that automatically matches the most relevant existing answers to a new question. First, we use phrases to represent the semantics of posts (answers and questions) and construct a Phrase Fusion Heterogeneous Information Network, called PFHIN, to represent complex entity relationships in CQA. Answer selection can thus be treated as a related-entity retrieval task. Then, we define a distance between entities in PFHIN that is independent of the meta path. Finally, we propose the Type-constrained Top-k Similarity Entity Finding Algorithm (TTSEF), which finds the nearest entities given a known start entity and an end-entity type and can therefore match the most relevant answers automatically. To the best of our knowledge, this is the first work to define the phrase information network for answer selection and to provide a novel idea for heterogeneous information network fusion. Experimental results on three large-scale datasets (Stack Overflow, Super User, and Mathematics) from Stack Exchange demonstrate that our proposed approaches significantly outperform state-of-the-art answer retrieval methods. Moreover, we conduct an in-depth analysis of the meta path to the optimal answer and reveal the critical role of phrases in community answer matching. |
---|---|
ISSN: | 0306-4573 1873-5371 |
DOI: | 10.1016/j.ipm.2020.102408 |
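
The retrieval step described in the abstract (start from a known entity, return the k nearest entities of a required type) can be illustrated with a small sketch. This is not the paper's TTSEF algorithm: the abstract does not state how the PFHIN distance is defined, so the sketch assumes a plain weighted shortest-path distance. The `HIN` class, the `top_k_by_type` function, and all node names are hypothetical and exist only for illustration.

```python
import heapq
from collections import defaultdict


class HIN:
    """Toy heterogeneous network: typed nodes, undirected weighted edges."""

    def __init__(self):
        self.node_type = {}
        self.adj = defaultdict(list)

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_edge(self, u, v, weight=1.0):
        self.adj[u].append((v, weight))
        self.adj[v].append((u, weight))


def top_k_by_type(hin, start, target_type, k):
    """Return up to k (node, distance) pairs of target_type nearest to start.

    ASSUMPTION: distance is the weighted shortest-path length (Dijkstra),
    standing in for the paper's distance, which the abstract does not give.
    """
    dist = {start: 0.0}
    heap = [(0.0, start)]
    settled = set()
    result = []
    while heap and len(result) < k:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        # Nodes are settled in order of increasing distance, so the first
        # k settled nodes of the target type are the k nearest ones.
        if u != start and hin.node_type[u] == target_type:
            result.append((u, d))
        for v, w in hin.adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return result


# Hypothetical toy network: phrases link a new question to older posts.
hin = HIN()
for node, ntype in [
    ("q_new", "question"), ("q_old", "question"),
    ("p_sort", "phrase"), ("p_list", "phrase"),
    ("a1", "answer"), ("a2", "answer"), ("a3", "answer"),
]:
    hin.add_node(node, ntype)
hin.add_edge("q_new", "p_sort")
hin.add_edge("p_sort", "a1")
hin.add_edge("q_new", "p_list")
hin.add_edge("p_list", "q_old")
hin.add_edge("q_old", "a2")
hin.add_edge("p_list", "a3", 3.0)

matches = top_k_by_type(hin, "q_new", "answer", 2)
print(matches)
```

In this toy network the new question reaches candidate answers only through phrase nodes, which mirrors the bridging role of phrases that the abstract reports. Stopping the search once k entities of the target type are settled avoids exploring the rest of the graph.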