Optimal Text Document Clustering Enabled by Weighed Similarity Oriented Jaya With Grey Wolf Optimization Algorithm

Abstract: Scientific progress has brought a variety of challenges to the field of information retrieval. These challenges stem from the growing use of large volumes of data, which originate in large-scale distributed networks. Centralization of these data...

Detailed description

Bibliographic details
Published in: Computer journal 2021-06, Vol.64 (6), p.960-972
Main authors: Venkanna, Gugulothu; Bharati, Dr K F
Format: Article
Language: English
Online access: Full text
container_end_page 972
container_issue 6
container_start_page 960
container_title Computer journal
container_volume 64
creator Venkanna, Gugulothu
Bharati, Dr K F
description Abstract: Scientific progress has brought a variety of challenges to the field of information retrieval. These challenges stem from the growing use of large volumes of data, which originate in large-scale distributed networks. Centralizing these data for analysis is difficult. Novel text document clustering algorithms are therefore required to overcome the challenges in clustering, the two most important of which are clustering accuracy and quality. This paper presents a clustering model for text documents that uses term frequency–inverse document frequency (TF-IDF) as the feature set. The model concentrates on initial centroid selection, which allows text to be clustered automatically using a weighted similarity measure. The weighted similarity function involves the inter-cluster and intra-cluster similarity of both ordered and unordered documents, and it is used to minimize weighted similarity among the documents. Clustering is driven by a hybrid optimization algorithm that combines the Jaya Algorithm (JA) and the Grey Wolf Algorithm (GWO), termed JA-based GWO. Finally, the performance of the proposed model is verified through a comparative analysis with state-of-the-art models. The analysis reports that, on the similarity index, the proposed model is 96.56% better than the genetic algorithm, 99.46% better than particle swarm optimization, 97.09% better than the Dragonfly algorithm, and 96.21% better than JA. The authors conclude that these results confirm the efficiency of the proposed model.
doi_str_mv 10.1093/comjnl/bxab013
format Article
publisher Oxford University Press
date 2021-06-19
rights The Author(s) 2021. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
tpages 13
toplevel peer_reviewed
online_resources
fulltext fulltext
identifier ISSN: 0010-4620
ispartof Computer journal, 2021-06, Vol.64 (6), p.960-972
issn 0010-4620
1460-2067
language eng
recordid cdi_crossref_primary_10_1093_comjnl_bxab013
source Oxford University Press Journals Current
title Optimal Text Document Clustering Enabled by Weighed Similarity Oriented Jaya With Grey Wolf Optimization Algorithm
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T01%3A10%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-oup_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimal%20Text%20Document%20Clustering%20Enabled%20by%20Weighed%20Similarity%20Oriented%20Jaya%20With%20Grey%20Wolf%20Optimization%20Algorithm&rft.jtitle=Computer%20journal&rft.au=Venkanna,%20Gugulothu&rft.date=2021-06-19&rft.volume=64&rft.issue=6&rft.spage=960&rft.epage=972&rft.pages=960-972&rft.issn=0010-4620&rft.eissn=1460-2067&rft_id=info:doi/10.1093/comjnl/bxab013&rft_dat=%3Coup_cross%3E10.1093/comjnl/bxab013%3C/oup_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_oup_id=10.1093/comjnl/bxab013&rfr_iscdi=true
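
The abstract names the two optimizers that the paper combines, the Jaya Algorithm (JA) and the Grey Wolf Algorithm (GWO), but does not say how they are hybridized. The sketch below is therefore only an illustration: it shows the standard Jaya and Grey Wolf update rules as commonly published, and it alternates them in a loop purely as a hypothetical combination scheme. The function names (`sphere`, `jaya_step`, `gwo_step`, `hybrid_minimise`), the search range and the toy objective are all assumptions. In the paper's setting, the objective would presumably be the weighted similarity measure evaluated on candidate centroids over TF-IDF features.

```python
import random


def sphere(x):
    """Toy objective (assumption): sum of squares, minimised at the origin."""
    return sum(v * v for v in x)


def jaya_step(pop, fitness, rng):
    """One Jaya iteration: move toward the best and away from the worst solution."""
    scores = [fitness(x) for x in pop]
    best = pop[scores.index(min(scores))]
    worst = pop[scores.index(max(scores))]
    new_pop = []
    for x, fx in zip(pop, scores):
        cand = [
            v + rng.random() * (b - abs(v)) - rng.random() * (w - abs(v))
            for v, b, w in zip(x, best, worst)
        ]
        # Greedy acceptance: keep the candidate only if it improves the objective.
        new_pop.append(cand if fitness(cand) < fx else x)
    return new_pop


def gwo_step(pop, fitness, a, rng):
    """One Grey Wolf iteration: each wolf moves toward the three best wolves."""
    leaders = sorted(pop, key=fitness)[:3]  # alpha, beta, delta
    new_pop = []
    for x in pop:
        cand = []
        for j, v in enumerate(x):
            pulls = []
            for leader in leaders:
                coef_a = 2 * a * rng.random() - a  # step-size coefficient A
                coef_c = 2 * rng.random()          # leader-weight coefficient C
                dist = abs(coef_c * leader[j] - v)
                pulls.append(leader[j] - coef_a * dist)
            cand.append(sum(pulls) / len(pulls))
        new_pop.append(cand)
    return new_pop


def hybrid_minimise(fitness, dim, pop_size=10, iters=50, seed=0):
    """Hypothetical hybrid: alternate GWO and Jaya steps, tracking the best solution seen."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for t in range(iters):
        a = 2 - 2 * t / iters  # GWO control parameter, decreasing from 2 toward 0
        if t % 2 == 0:
            pop = gwo_step(pop, fitness, a, rng)
        else:
            pop = jaya_step(pop, fitness, rng)
        cand = min(pop, key=fitness)
        if fitness(cand) < fitness(best):
            best = cand
    return best, fitness(best)


if __name__ == "__main__":
    solution, score = hybrid_minimise(sphere, dim=3)
    print(solution, score)
```

Alternating the two steps is only one of many possible ways to combine the rules; the record does not indicate which scheme the authors use, so the sketch should not be read as their method.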