Optimal Text Document Clustering Enabled by Weighed Similarity Oriented Jaya With Grey Wolf Optimization Algorithm

Abstract: Scientific development has brought a variety of challenges to the field of information retrieval. These challenges stem from the increased use of large volumes of data, which originate from large-scale distributed networks. Centralizing these data...

Detailed description

Bibliographic details
Published in:	Computer journal 2021-06, Vol.64 (6), p.960-972
Main authors:	Venkanna, Gugulothu; Bharati, Dr K F
Format:	Article
Language:	eng
Online access:	Full text

container_end_page	972
container_issue	6
container_start_page	960
container_title	Computer journal
container_volume	64
creator	Venkanna, Gugulothu; Bharati, Dr K F
description	Abstract: Scientific development has brought a variety of challenges to the field of information retrieval. These challenges stem from the increased use of large volumes of data, which originate from large-scale distributed networks. Centralizing these data for analysis is difficult. Novel text document clustering algorithms are therefore required to overcome the challenges in clustering, the two most important of which are clustering accuracy and quality. For this reason, this paper aims to present an ideal clustering model for text documents that uses term frequency–inverse document frequency (TF-IDF) as the feature set. The model concentrates on initial centroid selection, so that the text can be clustered automatically using a weighted similarity measure in the proposed clustering process. The weighted similarity function involves the inter-cluster and intra-cluster similarity of both ordered and unordered documents, and it is used to minimize the weighted similarity among the documents. The clustering model is driven by a hybrid optimization algorithm that combines the Jaya Algorithm (JA) and the Grey Wolf Algorithm (GWO); the proposed algorithm is therefore termed JA-based GWO. Finally, the performance of the proposed model is verified through a comparative analysis with state-of-the-art models. The performance analysis shows that, on the similarity index, the proposed model is 96.56% better than the genetic algorithm, 99.46% better than particle swarm optimization, 97.09% better than the Dragonfly algorithm, and 96.21% better than JA. The authors conclude that this analysis confirms the efficiency of the proposed model.
doi_str_mv	10.1093/comjnl/bxab013
format	Article
publisher	Oxford University Press
rights	The Author(s) 2021. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
date	2021-06-19
eissn	1460-2067
tpages	13
toplevel	peer_reviewed, online_resources
fulltext	fulltext
identifier	ISSN: 0010-4620
ispartof	Computer journal, 2021-06, Vol.64 (6), p.960-972
issn	0010-4620 1460-2067
language	eng
recordid	cdi_crossref_primary_10_1093_comjnl_bxab013
source	Oxford University Press Journals Current
title	Optimal Text Document Clustering Enabled by Weighed Similarity Oriented Jaya With Grey Wolf Optimization Algorithm
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T01%3A10%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-oup_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimal%20Text%20Document%20Clustering%20Enabled%20by%20Weighed%20Similarity%20Oriented%20Jaya%20With%20Grey%20Wolf%20Optimization%20Algorithm&rft.jtitle=Computer%20journal&rft.au=Venkanna,%20Gugulothu&rft.date=2021-06-19&rft.volume=64&rft.issue=6&rft.spage=960&rft.epage=972&rft.pages=960-972&rft.issn=0010-4620&rft.eissn=1460-2067&rft_id=info:doi/10.1093/comjnl/bxab013&rft_dat=%3Coup_cross%3E10.1093/comjnl/bxab013%3C/oup_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_oup_id=10.1093/comjnl/bxab013&rfr_iscdi=true
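
The description field above names three building blocks without giving any formulas: TF-IDF features, a similarity measure between documents, and a hybrid of the Jaya Algorithm and Grey Wolf Optimization. As a rough illustration only, the sketch below shows textbook versions of these pieces. It should not be read as the authors' method: the record does not specify the weighted similarity function or how JA and GWO are combined, so plain cosine similarity is used here as a stand-in, and the two update rules are shown separately. All function names (`tf_idf`, `cosine`, `jaya_step`, `gwo_step`) and the toy documents are hypothetical.

```python
import math
import random
from collections import Counter


def tf_idf(docs):
    """Build one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: (count / len(doc)) * math.log(n_docs / df[term])
         for term, count in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm.

    Assumption: a stand-in for the paper's weighted similarity measure.
    """
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaya_step(x, best, worst, rng):
    """Textbook Jaya move: drift toward the best solution, away from the worst."""
    return [
        xi + rng.random() * (b - abs(xi)) - rng.random() * (w - abs(xi))
        for xi, b, w in zip(x, best, worst)
    ]


def gwo_step(x, alpha, beta, delta, a, rng):
    """Textbook Grey Wolf move: average of the pulls toward the three leaders."""
    new_x = []
    for i, xi in enumerate(x):
        pulls = []
        for leader in (alpha, beta, delta):
            A = 2 * a * rng.random() - a   # exploration coefficient in [-a, a]
            C = 2 * rng.random()           # leader weighting in [0, 2)
            D = abs(C * leader[i] - xi)    # distance to the weighted leader
            pulls.append(leader[i] - A * D)
        new_x.append(sum(pulls) / 3)
    return new_x


# Toy usage with made-up documents and candidate solutions.
docs = [["grey", "wolf", "wolf"], ["grey", "cluster"], ["cluster", "text"]]
vectors = tf_idf(docs)
print(cosine(vectors[0], vectors[1]))

rng = random.Random(0)
print(jaya_step([0.5, 0.5], best=[1.0, 0.0], worst=[0.0, 1.0], rng=rng))
print(gwo_step([0.5, 0.5], [1.0, 0.0], [0.9, 0.1], [0.8, 0.2], a=1.0, rng=rng))
```

In a clustering setting, each candidate solution `x` would typically encode a set of cluster centroids and the fitness would be some similarity-based objective; in GWO the parameter `a` is usually decreased from 2 to 0 over the iterations so that the search moves from exploration to exploitation.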