A novel extractive text summarization system with self-organizing map clustering and entity recognition

Extractive text summarization yields the sensitive parts of the document by neglecting the irrelevant and redundant information. In this paper, we propose a new strategy for extractive single-document summarization in Malayalam. Initially, entity recognition is done, followed by relevance analysis i...

Full description

Bibliographic details
Published in:	Sadhana (Bangalore) 2020-12, Vol.45 (1), Article 32
Main authors:	RAHUL RAJ, M; HAROON, ROSNA P; SOBHANA, N V
Format:	Article
Language:	eng
Subjects:	Algorithms; Clustering; Engineering; Evaluation; Game theory; Impact analysis; On-line systems; Recognition; Self organizing maps; Sentences
Online access:	Full text

container_end_page
container_issue	1
container_start_page
container_title	Sadhana (Bangalore)
container_volume	45
creator	RAHUL RAJ, M; HAROON, ROSNA P; SOBHANA, N V
description	Extractive text summarization yields the sensitive parts of the document by neglecting the irrelevant and redundant information. In this paper, we propose a new strategy for extractive single-document summarization in Malayalam. Initially, entity recognition is done, followed by relevance analysis based on some context-aware features. The scored sentences are then clustered using self-organizing maps (SOM), and relevant sentences are extracted from these clusters based on the proposed algorithm. Both theoretical and practical evaluations are done to analyze the implemented system. In the theoretical evaluation, gradient calculations of the relevance equations are used to determine which of the sentence scoring features contribute more. The relevance equation is optimized with the help of Lagrange’s multiplier. The complexity analysis of the proposed algorithms is also performed. In the practical evaluation, the system is compared with online and offline summarizers on metrics like precision, recall, and F-measure. The system is also tested through a non-clustering approach in order to analyze the impact of the clustering used in our work. Some existing strategies like question game evaluation, sentence rank evaluation, and keyword association are also applied to evaluate different parameters like the relevance of sentences, important entity words, etc.
doi_str_mv	10.1007/s12046-019-1248-0
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2344979304</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2344979304</sourcerecordid><originalsourceid>FETCH-LOGICAL-c316t-58b88c4bb49f1ce8d9652a7921afc8214328e7fef9989b58af8fff706ae357323</originalsourceid><addsrcrecordid>eNp1kEtLAzEUhYMoWKs_wF3AdTSvmSTLUnxBwY2uQyZNxpR51CRTrb_eGUZw5ereC-ecy_kAuCb4lmAs7hKhmJcIE4UI5RLhE7DASjAkSiFOx50WJaJcqXNwkdIOYyqwZAtQr2DXH1wD3VeOxuZwcDCPO0xD25oYvk0OfQfTMWXXws-Q32FyjUd9rE0XvkNXw9bsoW2GURCn03Rb6Loc8hFGZ_u6C1PCJTjzpknu6ncuwdvD_ev6CW1eHp_Xqw2yjJQZFbKS0vKq4soT6-RWlQU1QlFivJWUcEalE955paSqCmm89N4LXBrHCsEoW4KbOXcf-4_Bpax3_RC78aWmjHMlFMN8VJFZZWOfUnRe72MY6x41wXriqWeeeuSpJ54ajx46e9J-6uniX_L_ph8oJHqC</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2344979304</pqid></control><display><type>article</type><title>A novel extractive text summarization system with self-organizing map clustering and entity recognition</title><source>Indian Academy of Sciences</source><source>EZB-FREE-00999 freely available EZB journals</source><source>SpringerLink Journals - AutoHoldings</source><creator>RAHUL RAJ, M ; HAROON, ROSNA P ; SOBHANA, N V</creator><creatorcontrib>RAHUL RAJ, M ; HAROON, ROSNA P ; SOBHANA, N V</creatorcontrib><description>Extractive text summarization yields the sensitive parts of the document by neglecting the irrelevant and redundant information. In this paper, we propose a new strategy for extractive single-document summarization in Malayalam. Initially, entity recognition is done, followed by relevance analysis is made based on some context-aware features. The scored sentences are then clustered using self-organizing maps (SOM) and from these clusters, relevant sentences are extracted out based on the proposed algorithm. Both theoretical and practical evaluations are done to analyze the implemented system. 
In theoretical evaluation, gradient calculations of relevance equations are used to know that which of these sentence scoring features are contributing more. The relevance equation is optimized with the help of Lagrange’s multiplier. The complexity analysis of the proposed algorithms is also performed. In practical evaluation, the system compared with online and offline summarizers upon metrics like precision, recall, and F-measure. The system is tested through a non-clustering approach also in order to analyze the impact of clustering used in our work. Some existing strategies like question game evaluation, sentence rank evaluation, and keyword association are also done to evaluate the different parameters like the relevance of sentences, important entity words, etc.</description><identifier>ISSN: 0256-2499</identifier><identifier>EISSN: 0973-7677</identifier><identifier>DOI: 10.1007/s12046-019-1248-0</identifier><language>eng</language><publisher>New Delhi: Springer India</publisher><subject>Algorithms ; Clustering ; Engineering ; Evaluation ; Game theory ; Impact analysis ; On-line systems ; Recognition ; Self organizing maps ; Sentences</subject><ispartof>Sadhana (Bangalore), 2020-12, Vol.45 (1), Article 32</ispartof><rights>Indian Academy of Sciences 2020</rights><rights>Indian Academy of Sciences 
2020.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c316t-58b88c4bb49f1ce8d9652a7921afc8214328e7fef9989b58af8fff706ae357323</citedby><cites>FETCH-LOGICAL-c316t-58b88c4bb49f1ce8d9652a7921afc8214328e7fef9989b58af8fff706ae357323</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s12046-019-1248-0$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s12046-019-1248-0$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27922,27923,41486,42555,51317</link.rule.ids></links><search><creatorcontrib>RAHUL RAJ, M</creatorcontrib><creatorcontrib>HAROON, ROSNA P</creatorcontrib><creatorcontrib>SOBHANA, N V</creatorcontrib><title>A novel extractive text summarization system with self-organizing map clustering and entity recognition</title><title>Sadhana (Bangalore)</title><addtitle>Sādhanā</addtitle><description>Extractive text summarization yields the sensitive parts of the document by neglecting the irrelevant and redundant information. In this paper, we propose a new strategy for extractive single-document summarization in Malayalam. Initially, entity recognition is done, followed by relevance analysis is made based on some context-aware features. The scored sentences are then clustered using self-organizing maps (SOM) and from these clusters, relevant sentences are extracted out based on the proposed algorithm. Both theoretical and practical evaluations are done to analyze the implemented system. In theoretical evaluation, gradient calculations of relevance equations are used to know that which of these sentence scoring features are contributing more. The relevance equation is optimized with the help of Lagrange’s multiplier. 
The complexity analysis of the proposed algorithms is also performed. In practical evaluation, the system compared with online and offline summarizers upon metrics like precision, recall, and F-measure. The system is tested through a non-clustering approach also in order to analyze the impact of clustering used in our work. Some existing strategies like question game evaluation, sentence rank evaluation, and keyword association are also done to evaluate the different parameters like the relevance of sentences, important entity words, etc.</description><subject>Algorithms</subject><subject>Clustering</subject><subject>Engineering</subject><subject>Evaluation</subject><subject>Game theory</subject><subject>Impact analysis</subject><subject>On-line systems</subject><subject>Recognition</subject><subject>Self organizing maps</subject><subject>Sentences</subject><issn>0256-2499</issn><issn>0973-7677</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNp1kEtLAzEUhYMoWKs_wF3AdTSvmSTLUnxBwY2uQyZNxpR51CRTrb_eGUZw5ereC-ecy_kAuCb4lmAs7hKhmJcIE4UI5RLhE7DASjAkSiFOx50WJaJcqXNwkdIOYyqwZAtQr2DXH1wD3VeOxuZwcDCPO0xD25oYvk0OfQfTMWXXws-Q32FyjUd9rE0XvkNXw9bsoW2GURCn03Rb6Loc8hFGZ_u6C1PCJTjzpknu6ncuwdvD_ev6CW1eHp_Xqw2yjJQZFbKS0vKq4soT6-RWlQU1QlFivJWUcEalE955paSqCmm89N4LXBrHCsEoW4KbOXcf-4_Bpax3_RC78aWmjHMlFMN8VJFZZWOfUnRe72MY6x41wXriqWeeeuSpJ54ajx46e9J-6uniX_L_ph8oJHqC</recordid><startdate>20201201</startdate><enddate>20201201</enddate><creator>RAHUL RAJ, M</creator><creator>HAROON, ROSNA P</creator><creator>SOBHANA, N V</creator><general>Springer India</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20201201</creationdate><title>A novel extractive text summarization system with self-organizing map clustering and entity recognition</title><author>RAHUL RAJ, M ; HAROON, ROSNA P ; SOBHANA, N 
V</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c316t-58b88c4bb49f1ce8d9652a7921afc8214328e7fef9989b58af8fff706ae357323</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Clustering</topic><topic>Engineering</topic><topic>Evaluation</topic><topic>Game theory</topic><topic>Impact analysis</topic><topic>On-line systems</topic><topic>Recognition</topic><topic>Self organizing maps</topic><topic>Sentences</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>RAHUL RAJ, M</creatorcontrib><creatorcontrib>HAROON, ROSNA P</creatorcontrib><creatorcontrib>SOBHANA, N V</creatorcontrib><collection>CrossRef</collection><jtitle>Sadhana (Bangalore)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>RAHUL RAJ, M</au><au>HAROON, ROSNA P</au><au>SOBHANA, N V</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A novel extractive text summarization system with self-organizing map clustering and entity recognition</atitle><jtitle>Sadhana (Bangalore)</jtitle><stitle>Sādhanā</stitle><date>2020-12-01</date><risdate>2020</risdate><volume>45</volume><issue>1</issue><artnum>32</artnum><issn>0256-2499</issn><eissn>0973-7677</eissn><abstract>Extractive text summarization yields the sensitive parts of the document by neglecting the irrelevant and redundant information. In this paper, we propose a new strategy for extractive single-document summarization in Malayalam. Initially, entity recognition is done, followed by relevance analysis is made based on some context-aware features. The scored sentences are then clustered using self-organizing maps (SOM) and from these clusters, relevant sentences are extracted out based on the proposed algorithm. 
Both theoretical and practical evaluations are done to analyze the implemented system. In theoretical evaluation, gradient calculations of relevance equations are used to know that which of these sentence scoring features are contributing more. The relevance equation is optimized with the help of Lagrange’s multiplier. The complexity analysis of the proposed algorithms is also performed. In practical evaluation, the system compared with online and offline summarizers upon metrics like precision, recall, and F-measure. The system is tested through a non-clustering approach also in order to analyze the impact of clustering used in our work. Some existing strategies like question game evaluation, sentence rank evaluation, and keyword association are also done to evaluate the different parameters like the relevance of sentences, important entity words, etc.</abstract><cop>New Delhi</cop><pub>Springer India</pub><doi>10.1007/s12046-019-1248-0</doi></addata></record>
fulltext	fulltext
identifier	ISSN: 0256-2499
ispartof	Sadhana (Bangalore), 2020-12, Vol.45 (1), Article 32
issn	0256-2499; 0973-7677
language	eng
recordid	cdi_proquest_journals_2344979304
source	Indian Academy of Sciences; EZB-FREE-00999 freely available EZB journals; SpringerLink Journals - AutoHoldings
subjects	Algorithms; Clustering; Engineering; Evaluation; Game theory; Impact analysis; On-line systems; Recognition; Self organizing maps; Sentences
title	A novel extractive text summarization system with self-organizing map clustering and entity recognition
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T13%3A28%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20novel%20extractive%20text%20summarization%20system%20with%20self-organizing%20map%20clustering%20and%20entity%20recognition&rft.jtitle=Sadhana%20(Bangalore)&rft.au=RAHUL%20RAJ,%20M&rft.date=2020-12-01&rft.volume=45&rft.issue=1&rft.artnum=32&rft.issn=0256-2499&rft.eissn=0973-7677&rft_id=info:doi/10.1007/s12046-019-1248-0&rft_dat=%3Cproquest_cross%3E2344979304%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2344979304&rft_id=info:pmid/&rfr_iscdi=true
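
The description in this record says the system is compared with other summarizers on precision, recall, and F-measure. The record does not state how the paper computes these, so the following is only a minimal sketch under one common assumption: both summaries are treated as sets of sentence identifiers and scored by their overlap. The function name `precision_recall_f1` and the sample identifiers are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(system_sentences, reference_sentences):
    """Score an extractive summary against a reference by sentence overlap.

    Assumption: sentences are compared as exact identifiers (set overlap).
    This is an illustration, not the evaluation protocol of the paper.
    """
    system = set(system_sentences)
    reference = set(reference_sentences)
    overlap = len(system & reference)

    # Precision: share of extracted sentences that are in the reference.
    precision = overlap / len(system) if system else 0.0
    # Recall: share of reference sentences that were extracted.
    recall = overlap / len(reference) if reference else 0.0

    # F-measure: harmonic mean of precision and recall.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical sentence identifiers, not data from the paper.
system_summary = ["s1", "s3", "s4", "s7"]
reference_summary = ["s1", "s2", "s3", "s4", "s9"]

p, r, f = precision_recall_f1(system_summary, reference_summary)
print(f"precision={p:.2f} recall={r:.2f} F-measure={f:.2f}")
```

Set overlap on whole sentences suits extractive summaries because every extracted sentence is copied verbatim from the document; a word-level or n-gram overlap would be a different design choice.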