Graph-based extractive text summarization based on single document

Bibliographic details
Published in: Multimedia Tools and Applications, 2024-02, Vol. 83 (7), p. 18987-19013
Main authors: Yadav, Avaneesh Kumar; Ranvijay; Yadav, Rama Shankar; Maurya, Ashish Kumar
Format: Article
Language: English
Description
Abstract: The amount of online and offline text data from sources such as legal documents, medical documents, and news articles is growing rapidly. Manually summarizing large documents is infeasible and costly because it demands considerable time and effort. Consequently, various graph-based text summarization techniques have been designed to provide thorough, well-prepared summaries of documents. The problems that persist in these techniques are data redundancy, loss of information, and poor readability. To overcome these problems, we propose a textual graph-based extractive text summarization technique, called TGETS, for extracting essential information from a single document. In the proposed approach, a node of the graph represents a group of sentences in the document, and an edge represents an association between two sentences. Summary generation is based on the sum of sentence weights and the average weight of the textual graph. The performance of the proposed approach is evaluated on the BBC news articles dataset using the ROUGE metrics R1 and R2. For summaries of 100-200 words, the proposed approach improves on the existing PageRank (PR) method by 19.88%, 38.76%, and 30.73% in precision, recall, and F1-score, respectively, for R1. Similarly, for R2, it exceeds the PR method by 32%, 26.99%, and 29.01% in precision, recall, and F1-score, respectively.
ISSN: 1573-7721, 1380-7501
DOI:10.1007/s11042-023-16199-8
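
The abstract describes the approach only at a high level, so the following is a minimal sketch of the general idea rather than the authors' TGETS implementation. It assumes one sentence per node, Jaccard word overlap as the edge weight, and a selection rule that keeps sentences whose summed edge weight is at least the graph average. All function names (`split_sentences`, `edge_weight`, `summarize`, `rouge_n`) are hypothetical, and `rouge_n` is a simplified n-gram overlap score, not the official ROUGE toolkit.

```python
import re
from collections import Counter
from itertools import combinations


def split_sentences(text):
    """Naive splitter on '.', '!' and '?' (assumption: adequate for a demo)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text):
    """Lowercased alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def edge_weight(a, b):
    """Jaccard word overlap, a stand-in for the paper's association measure."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def summarize(text):
    """Keep sentences whose summed edge weight is at least the graph average."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    weight = [0.0] * len(sentences)
    for i, j in combinations(range(len(sentences)), 2):
        w = edge_weight(sentences[i], sentences[j])
        weight[i] += w  # each edge contributes to both endpoints
        weight[j] += w
    average = sum(weight) / len(weight)
    # Selected sentences stay in document order.
    return [s for s, w in zip(sentences, weight) if w >= average]


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: clipped n-gram overlap as (precision, recall, F1)."""
    def ngrams(text):
        t = tokens(text)
        return Counter(tuple(t[i:i + n]) for i in range(len(t) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    doc = "The cat sat on the mat. The cat ate the fish. Stocks fell sharply today."
    summary = " ".join(summarize(doc))
    print(summary)
    print(rouge_n(summary, "The cat sat on the mat.", n=1))
```

In this toy input the third sentence shares no words with the others, so its weight falls below the average and it is dropped. A real system would likely need a better sentence splitter, stop-word handling, and a length limit to target the 100-200 word range mentioned in the abstract.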