Deep learning based sequence to sequence model for abstractive Telugu text summarization

Bibliographic details
Published in: Multimedia Tools and Applications, 2023-05, Vol. 82 (11), p. 17075-17096
Main authors: Babu, G. L. Anand; Badugu, Srinivasu
Format: Article
Language: English
Description
Abstract: With the emergence of deep learning, researchers have paid significantly more attention to abstractive text summarization. Although extractive text summarization (ETS) is an important approach, the summaries it generates are not always coherent. This paper focuses on abstractive text summarization (ATS) for the Telugu language, with the aim of generating coherent summaries. Most research on ATS has been conducted in English, while no significant research on Telugu has been documented. The paper proposes an abstractive Telugu text summarization model based on a sequence-to-sequence (seq2seq) encoder-decoder architecture, implemented with a bidirectional long short-term memory (Bi-LSTM) encoder and a long short-term memory (LSTM) decoder. Existing ATS approaches have several drawbacks: they cannot handle out-of-vocabulary words, they suffer from attention deficiency on long text sequences, and they tend to repeat content. To overcome these issues, the proposed model integrates a pointer-generator network, a temporal attention mechanism, and a coverage mechanism. In addition, a diverse beam search decoding algorithm is employed to increase the diversity of the generated summaries. The proposed seq2seq model thus combines a Bi-LSTM and LSTM based encoder-decoder, a pointer-generator network, a temporal attention mechanism, a coverage mechanism, and diverse beam search decoding. Performance is evaluated with the ROUGE toolkit in terms of F-measure, recall, and precision. The experimental results are compared with those of existing methods and show that the proposed ATS model outperforms existing Telugu text summarization models.
ISSN: 1380-7501; 1573-7721
DOI: 10.1007/s11042-022-14099-x
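
Illustrative note: the abstract names a pointer-generator network and a coverage mechanism but gives no formulas. The sketch below assumes the formulation commonly used for these mechanisms, which may differ from the authors' implementation: the final word distribution mixes a vocabulary distribution with attention-weighted copying from the source, and coverage accumulates past attention so that re-attending to the same positions is penalized. The function names, the romanized placeholder tokens, and all numbers are hypothetical.

```python
from collections import defaultdict


def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generation and copy probabilities into one distribution.

    p_gen         -- probability of generating from the fixed vocabulary (0..1)
    vocab_dist    -- dict: vocabulary word -> generation probability
    attention     -- attention weight per source position (sums to 1)
    source_tokens -- source tokens, aligned with `attention`
    """
    mixed = defaultdict(float)
    for word, prob in vocab_dist.items():
        mixed[word] += p_gen * prob
    # Copy path: attention mass on a source position flows to its token,
    # so a source word missing from the vocabulary still gets probability.
    for token, weight in zip(source_tokens, attention):
        mixed[token] += (1.0 - p_gen) * weight
    return dict(mixed)


def coverage_step(coverage, attention):
    """Return (coverage loss for this step, updated coverage vector).

    The loss is the overlap between the current attention and the
    attention already spent on each position in earlier steps.
    """
    loss = sum(min(a, c) for a, c in zip(attention, coverage))
    updated = [c + a for c, a in zip(coverage, attention)]
    return loss, updated


# Hypothetical toy input: "hyderabad" is not in the vocabulary.
source = ["hyderabad", "lo", "varsham"]
vocab = {"lo": 0.6, "varsham": 0.4}
attn = [0.7, 0.1, 0.2]

dist = final_distribution(0.5, vocab, attn, source)
first_loss, cov = coverage_step([0.0, 0.0, 0.0], attn)
second_loss, cov = coverage_step(cov, attn)
```

In this toy case the out-of-vocabulary token receives probability only through the copy path, and attending to the same positions on a second decoding step produces a positive coverage loss, which is the signal a coverage-trained model would use to discourage repetition.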