Text summarization using BART

Bibliographic Details
Main Authors: Adhik, Chintalwar; Sri Lakshmi, Sonti; Muralidharan, C.
Format: Conference Proceedings
Language: English
Online Access: Full text
Description
Abstract: The ongoing expansion of unstructured, text-based data in the digital environment necessitates automatic text summarization systems that allow users to easily extract insights from it. We now have instant access to massive amounts of information, but much of this data is superfluous, inconsequential, and may not convey the intended message. To locate the information needed in an online news article, a user may have to sift through the content and spend a significant amount of time discarding extraneous material. As a result, computerized text summarizers that can extract vital information while removing inessential and unnecessary material are becoming increasingly important. Summarization can make documents simpler to read, reduce the time spent searching for information, and pack more detail into less space. In general, NLP uses two approaches to summarizing texts: extraction and abstraction. ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation; in this study, ROUGE metrics were used to evaluate the summaries. BART (Bidirectional and Auto-Regressive Transformers) has been used as the summarization model. It is a denoising autoencoder for pre-training sequence-to-sequence models: a transformer encoder-decoder (seq2seq) architecture that combines a bidirectional encoder (as in BERT) with an autoregressive decoder (as in GPT). The proposed model produced output summaries with a quality score of about 78 percent for a given input text.
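
The record itself does not include the authors' implementation, but the pipeline the abstract describes, abstractive summarization with a pre-trained BART model followed by ROUGE scoring, can be sketched with standard tooling. The snippet below is a minimal illustration, assuming the Hugging Face transformers library, the facebook/bart-large-cnn checkpoint, and the rouge_score package; the paper's actual checkpoint, hyperparameters, dataset, and the exact definition of its 78 percent quality score are not specified in this record.

```python
# Minimal sketch of BART-based abstractive summarization with ROUGE evaluation.
# Assumptions (not confirmed by the paper): the facebook/bart-large-cnn
# checkpoint, beam-search decoding settings, and the rouge_score package.
from transformers import BartForConditionalGeneration, BartTokenizer
from rouge_score import rouge_scorer

MODEL_NAME = "facebook/bart-large-cnn"  # assumed checkpoint
tokenizer = BartTokenizer.from_pretrained(MODEL_NAME)
model = BartForConditionalGeneration.from_pretrained(MODEL_NAME)

article = "Long news article text goes here ..."

# BART's bidirectional encoder reads the whole input at once (like BERT).
inputs = tokenizer(article, max_length=1024, truncation=True, return_tensors="pt")

# The autoregressive decoder then generates the summary token by token (like GPT).
summary_ids = model.generate(
    inputs["input_ids"],
    num_beams=4,
    min_length=30,
    max_length=130,
    early_stopping=True,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)

# ROUGE (Recall-Oriented Understudy for Gisting Evaluation) compares the
# generated summary with a human reference via n-gram and
# longest-common-subsequence overlap.
reference = "Human-written reference summary goes here ..."
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
for name, score in scorer.score(reference, summary).items():
    print(f"{name}: F1 = {score.fmeasure:.3f}")
```

Beam search is shown because it is a common default for BART summarization checkpoints; the paper may well use different decoding settings.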
ISSN: 0094-243X, 1551-7616
DOI: 10.1063/5.0217004