Hybrid method for text summarization based on statistical and semantic treatment
Published in: | Multimedia Tools and Applications 2021-05, Vol.80 (13), p.19567-19600 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Text summarization presents several challenges, such as accounting for semantic relationships among words and dealing with redundancy and information diversity. To overcome these problems, we propose in this paper a new graph-based Arabic summarization system that combines statistical and semantic analysis. The proposed approach uses the hierarchical structure and relations of an ontology to measure similarity between terms more accurately and thereby improve the quality of the summary. The method is based on a two-dimensional graph model that makes use of both statistical and semantic similarities. The statistical similarity is based on the content overlap between two sentences, while the semantic similarity is computed from semantic information extracted from a lexical database, which enables our system to apply reasoning by measuring the semantic distance between real human concepts. The weighted PageRank ranking algorithm is run on the graph to produce a significance score for every sentence in the document. The score of each sentence is then refined by adding other statistical features. In addition, we address redundancy and information diversity by using an adapted version of the Maximal Marginal Relevance (MMR) method. Experimental results on EASC and our own datasets showed that the proposed approach is more effective than existing summarization systems. |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-021-10613-9 |
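
The pipeline outlined in the abstract (a sentence graph weighted by statistical and semantic similarity, weighted PageRank, then MMR selection) can be illustrated with a minimal sketch. This is not the authors' implementation. The Jaccard overlap, the linear mix controlled by `alpha`, the `semantic_similarity` callback, the max-normalized relevance in MMR, and all parameter values are assumptions chosen for illustration; the record does not specify any of them.

```python
import re


def tokenize(sentence):
    """Lowercase word tokens; a stand-in for real Arabic preprocessing."""
    return re.findall(r"\w+", sentence.lower())


def jaccard(a, b):
    """Content overlap of two token lists (assumed measure, not from the paper)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def build_graph(sentences, semantic_similarity=None, alpha=0.5):
    """Symmetric weight matrix mixing statistical and semantic similarity.

    semantic_similarity is a hypothetical callback (sentence, sentence) -> [0, 1],
    for example backed by a lexical database; it defaults to 0 when omitted.
    """
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            stat = jaccard(tokens[i], tokens[j])
            sem = semantic_similarity(sentences[i], sentences[j]) if semantic_similarity else 0.0
            weights[i][j] = weights[j][i] = alpha * stat + (1 - alpha) * sem
    return weights


def weighted_pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over the sentence graph."""
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def mmr_select(weights, scores, k, lam=0.7):
    """Greedy Maximal Marginal Relevance: trade relevance against redundancy."""
    top = max(scores) or 1.0
    selected = []
    candidates = set(range(len(scores)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max((weights[i][j] for j in selected), default=0.0)
            return lam * scores[i] / top - (1 - lam) * redundancy
        best = max(sorted(candidates), key=mmr)
        selected.append(best)
        candidates.remove(best)
    return sorted(selected)  # restore document order for the summary
```

A caller would chain the three steps, for example `weights = build_graph(sentences)`, `scores = weighted_pagerank(weights)`, `summary_ids = mmr_select(weights, scores, k=2)`. Lowering `lam` penalizes sentences that resemble ones already selected more heavily, which is how this sketch approximates the diversity goal mentioned in the abstract.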