Improving and evaluating complex question answering over knowledge bases by constructing strongly supervised data

Bibliographic details
Published in: Neural computing & applications 2023-03, Vol.35 (7), p.5513-5533
Main authors: Cao, Xing; Zhao, Yingsi; Shen, Bo
Format: Article
Language: English
Description
Abstract: Complex question answering (CQA) is widely used in real-world tasks such as search engines and intelligent customer service. With the development of large-scale knowledge bases, CQA over knowledge bases has attracted considerable attention in recent years. However, there are many types of complex questions, and few works focus in depth on analysing model performance across different types of questions. Another major challenge is the lack of complete supervised labels due to the expense of manual labelling, which decreases model interpretability and increases the difficulty of model training. In this paper, we constructed a dataset, named CoSuQue, which includes multiple types of complex questions and complete supervised labels that are easily obtained. Our work provides an in-depth analysis of the model’s ability to answer different types of questions, contributing a comprehensive evaluation of the performance of CQA models. Based on the ability of the model to handle different types of questions, the model structure can be improved in a more targeted manner. The different types of complex questions and the complete supervised labels allow the inference process of the model to be investigated. Furthermore, we propose a novel training method that leverages the proposed dataset to improve the performance of the model on other publicly available datasets. Experiments on the Complex WebQuestions and WebQuestionsSP datasets demonstrate the effectiveness of our approach on the CQA task.
ISSN: 0941-0643, 1433-3058
DOI: 10.1007/s00521-022-07965-0