Bengali Abstractive News Summarization(BANS): A Neural Attention Approach
Abstractive summarization is the process of generating novel sentences based on the information extracted from the original text document while retaining the context. Due to abstractive summarization's underlying complexities, most of the past research work has been done on the extractive summa...
Saved in:
Published in: | arXiv.org 2020-12 |
---|---|
Main authors: | Bhattacharjee, Prithwiraj; Mallick, Avi; Md Saiful Islam; Marium-E-Jannat |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Bhattacharjee, Prithwiraj; Mallick, Avi; Md Saiful Islam; Marium-E-Jannat |
description | Abstractive summarization is the process of generating novel sentences based on the information extracted from the original text document while retaining the context. Due to abstractive summarization's underlying complexities, most past research has focused on the extractive summarization approach. Nevertheless, with the success of the sequence-to-sequence (seq2seq) model, abstractive summarization has become more viable. Although a significant amount of notable research on abstractive summarization has been done for English, only a couple of works exist on Bengali abstractive news summarization (BANS). In this article, we present a seq2seq-based Long Short-Term Memory (LSTM) network model with attention at the encoder-decoder. Our proposed system deploys a local attention-based model that produces long sequences of lucid, human-like sentences conveying the noteworthy information of the original document. We also prepared a dataset of more than 19k articles and corresponding human-written summaries collected from bangla.bdnews24.com, which is, to date, the most extensive dataset for Bengali news document summarization, and published it publicly on Kaggle. We evaluated our model qualitatively and quantitatively and compared it with other published results. It showed significant improvement in human evaluation scores over state-of-the-art approaches for BANS. |
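The abstract describes local (windowed) attention at the encoder-decoder interface: at each decoding step, attention weights are computed only over a window of encoder states around an aligned source position. As a hedged illustration only (this is a minimal pure-Python sketch of Luong-style local attention, not the authors' code; the function name, dot-product scoring, and window parameter are assumptions), the core step might look like:

```python
import math

def local_attention(enc_states, dec_state, center, window=2):
    """Sketch of one local-attention step.

    enc_states: list of encoder hidden vectors (lists of floats)
    dec_state:  current decoder hidden vector
    center:     aligned source position p_t for this decoding step
    window:     half-width D of the local window [p_t - D, p_t + D]
    Returns (attention weights over the window, context vector).
    """
    lo = max(0, center - window)
    hi = min(len(enc_states) - 1, center + window)
    # Dot-product alignment scores, computed only inside the window.
    scores = [sum(h * s for h, s in zip(enc_states[i], dec_state))
              for i in range(lo, hi + 1)]
    # Numerically stable softmax over the windowed scores.
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the windowed encoder states.
    dim = len(dec_state)
    context = [sum(w * enc_states[lo + j][d] for j, w in enumerate(weights))
               for d in range(dim)]
    return weights, context
```

In a full model, the context vector would be combined with the decoder state to predict the next summary token; restricting attention to a window keeps the computation focused on the locally relevant part of a long news article.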
doi_str_mv | 10.48550/arxiv.2012.01747 |
format | Article |
publisher | Ithaca: Cornell University Library, arXiv.org |
rights | 2020. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
related | Published version: https://doi.org/10.1007/978-981-33-4673-4_4 |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2020-12 |
issn | 2331-8422 |
language | eng |
recordid | cdi_arxiv_primary_2012_01747 |
source | arXiv.org; Free E-Journals |
subjects | Coders; Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Datasets; Encoders-Decoders; News; Sentences |
title | Bengali Abstractive News Summarization(BANS): A Neural Attention Approach |