Exploring heterogeneous features for query-focused summarization of categorized community answers

Bibliographic details
Published in:	Information sciences, 2016-02, Vol. 330, p. 403-423
Main authors:	Wei, Wei; Ming, ZhaoYan; Nie, Liqiang; Li, Guohui; Li, Jianjun; Zhu, Feida; Shang, Tianfeng; Luo, Changyin
Format:	Article
Language:	eng
Subjects:	Categories; Community-based question answering; Graph-based ranking; Historic; Markov models; Mathematical models; Query processing; Ranking; Redundancy; Retrieval; Summarization
Online access:	Full text

Description
Summary:	Community-based question answering (cQA) is a popular type of online knowledge-sharing web service where users ask questions and obtain answers contributed by others. To enhance knowledge sharing, cQA also provides users with a retrieval function to access historical question-answer pairs (QAs). However, this function remains ineffective because the retrieval result is typically a ranked list of potentially relevant QAs rather than a succinct and informative answer. To alleviate the problem, this paper proposes a three-level scheme that aims to generate a query-focused, summary-style answer in terms of two factors, i.e., novelty and redundancy. Specifically, we first retrieve a set of QAs for the given query, and then develop a smoothed Naive Bayes model to identify the topics of answers by exploiting their associated category information. Next, to compute the global ranking scores of answers, we first propose a graph-based method that models a Markov random walk on a graph parameterized by the heterogeneous features of answers, and then combine the resulting ranking scores with the relevance scores of answers. Based on the computed global ranking scores, we use two different strategies to construct the top-K candidate answer set, and finally solve a constrained optimization problem on the sentence set of the top-K answers to generate a summary for the user's query. Experiments on real-world data demonstrate the effectiveness of the proposed approach compared with the baselines.
ISSN:	0020-0255, 1872-6291
DOI:	10.1016/j.ins.2015.10.024
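
Note: the ranking stage described in the summary (a Markov random walk over a graph of answers, blended with query relevance scores, followed by top-K selection) can be illustrated with a minimal sketch. This is not the authors' implementation. The record does not give the paper's feature parameterization, its combination formula, or its two top-K strategies, so the edge weights are taken as given, and the damping factor, the linear blend, the plain sort, and all function names below are assumptions made only for illustration.

```python
def random_walk_scores(weights, damping=0.85, iters=100, tol=1e-10):
    """Stationary scores of a damped random walk on a weighted answer graph.

    weights[i][j] is a non-negative edge weight from answer i to answer j.
    In the paper these weights come from heterogeneous answer features;
    here they are simply supplied by the caller (assumption).
    """
    n = len(weights)
    # Row-normalise weights into transition probabilities.
    # A row with no outgoing weight jumps uniformly to every node.
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([w / total for w in row] if total > 0 else [1.0 / n] * n)

    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [
            (1 - damping) / n
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return scores


def global_scores(walk, relevance, mix=0.5):
    """Blend walk scores with query relevance scores.

    A linear blend with weight `mix` is an assumed form, not the paper's.
    """
    return [mix * w + (1 - mix) * r for w, r in zip(walk, relevance)]


def top_k(scores, k):
    """Indices of the k highest-scoring answers, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


if __name__ == "__main__":
    # Toy graph: answer 0 is linked to both other answers.
    graph = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
    walk = random_walk_scores(graph)
    combined = global_scores(walk, relevance=[0.2, 0.9, 0.1])
    print(top_k(combined, k=2))
```

The sentence-level constrained optimization that produces the final summary is not sketched here, because the record does not state its objective or constraints.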