Amplifying document categorization with advanced features and deep learning

The field of natural language processing (NLP) plays a pivotal role in discerning unstructured data from diverse origins. This study employs advanced techniques rooted in machine learning and deep learning to effectively categorize news articles. Notably, deep learning models have demonstrated super...

Detailed description

Bibliographic details
Published in: Multimedia tools and applications 2024-03, Vol.83 (26), p.68087-68105
Main authors: Kavitha, M., Akila, K.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 68105
container_issue 26
container_start_page 68087
container_title Multimedia tools and applications
container_volume 83
creator Kavitha, M.
Akila, K.
description The field of natural language processing (NLP) plays a pivotal role in discerning unstructured data from diverse origins. This study employs advanced techniques rooted in machine learning and deep learning to effectively categorize news articles. Notably, deep learning models have demonstrated superior performance over traditional machine learning algorithms, rendering them a popular choice for a range of NLP tasks. The research employs feature extraction techniques to identify multiword tokens, negation words, and out-of-vocabulary words and replace them. Additionally, convolutional neural network models leverage embedding, convolutional layers, and max pooling layers to capture intricate features. For tasks requiring an understanding of dependencies among long phrases, long short-term memory models come into play. The evaluation of the proposed model hinges on training it with datasets like AG News, BBC, and 20 Newsgroup, gauging its efficacy. The study delves into the myriad challenges inherent to text classification. These challenges are thoughtfully discussed, shedding light on the intricacies of the process. Furthermore, the research furnishes comprehensive test outcomes for both conventional machine learning and deep learning models. The significance of this proposed model is that it uses a multiword expression lexicon, wordnet synset, and word embedding techniques for feature extraction. The performance of the models is increased when using these feature extraction techniques.
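The preprocessing step named in the description (identifying multiword tokens, negation words, and out-of-vocabulary words and replacing them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy lexicon, the vocabulary, the `<unk>` placeholder, and all function names are hypothetical and are not taken from the article. In particular, the article mentions a WordNet synset and word embeddings for feature extraction, whereas this sketch substitutes a plain placeholder for unknown words.

```python
import re

# Hypothetical toy resources; the article's actual lexicon and vocabulary
# are not given in this record.
MWE_LEXICON = {("new", "york"), ("machine", "learning")}
NEGATION_WORDS = {"not", "no", "never"}
VOCABULARY = {
    "the", "model", "is", "in", "news",
    "machine_learning", "new_york", "not_good",
}
OOV_TOKEN = "<unk>"  # assumed placeholder, not the authors' replacement rule


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def merge_multiword(tokens):
    """Join adjacent token pairs found in the lexicon into one token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in MWE_LEXICON:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def mark_negation(tokens):
    """Fuse a negation word with the token that follows it."""
    marked, i = [], 0
    while i < len(tokens):
        if tokens[i] in NEGATION_WORDS and i + 1 < len(tokens):
            marked.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            marked.append(tokens[i])
            i += 1
    return marked


def replace_oov(tokens):
    """Replace every token outside the vocabulary with the placeholder."""
    return [t if t in VOCABULARY else OOV_TOKEN for t in tokens]


def preprocess(text):
    """Run the three replacement steps in sequence."""
    return replace_oov(mark_negation(merge_multiword(tokenize(text))))


if __name__ == "__main__":
    print(preprocess("The machine learning model is not good in New York"))
```

The resulting token list would then feed an embedding layer of a CNN or LSTM classifier of the kind the description mentions; the order of the three steps is a design choice of this sketch, not something the record specifies.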
doi_str_mv 10.1007/s11042-024-18483-7
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3083014300</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3083014300</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-d251886bc7838056ae63c1f2dcbd72c485cfc9dc4f4b258ed962e63f6a1b1f5e3</originalsourceid><addsrcrecordid>eNp9kDtPwzAUhS0EEqXwB5gsMQf8SuyOVcVLVGKB2XLs65CqdYKdgMqvxxAkmJjuGb5zrvQhdE7JJSVEXiVKiWAFYaKgSiheyAM0o6XMQTJ6-Ccfo5OUNoTQqmRihh6Wu37b-n0bGuw6O-4gDNiaAZouth9maLuA39vhBRv3ZoIFhz2YYYyQsAkOO4Aeb8HEkAdO0ZE32wRnP3eOnm-un1Z3xfrx9n61XBeWSTIUjpVUqaq2UnFFyspAxS31zNnaSWaFKq23C2eFFzUrFbhFxTLiK0Nr6kvgc3Qx7faxex0hDXrTjTHkl5oTxQkVnJBMsYmysUspgtd9bHcm7jUl-kuanqTpLE1_S9Myl_hUShkODcTf6X9an4X4cDg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3083014300</pqid></control><display><type>article</type><title>Amplifying document categorization with advanced features and deep learning</title><source>SpringerLink Journals - AutoHoldings</source><creator>Kavitha, M. ; Akila, K.</creator><creatorcontrib>Kavitha, M. ; Akila, K.</creatorcontrib><description>The field of natural language processing (NLP) plays a pivotal role in discerning unstructured data from diverse origins. This study employs advanced techniques rooted in machine learning and deep learning to effectively categorize news articles. Notably, deep learning models have demonstrated superior performance over traditional machine learning algorithms, rendering them a popular choice for a range of NLP tasks. The research employs feature extraction techniques to identify multiword tokens, negation words, and out-of-vocabulary words and replace them. Additionally, convolutional neural network models leverage embedding, convolutional layers, and max pooling layers to capture intricate features. For tasks requiring an understanding of dependencies among long phrases, long short-term memory models come into play. 
The evaluation of the proposed model hinges on training it with datasets like AG News, BBC, and 20 Newsgroup, gauging its efficacy. The study delves into the myriad challenges inherent to text classification. These challenges are thoughtfully discussed, shedding light on the intricacies of the process. Furthermore, the research furnishes comprehensive test outcomes for both conventional machine learning and deep learning models. The significance of this proposed model is that it uses a multiword expression lexicon, wordnet synset, and word embedding techniques for feature extraction. The performance of the models is increased when using these feature extraction techniques.</description><identifier>ISSN: 1573-7721</identifier><identifier>ISSN: 1380-7501</identifier><identifier>EISSN: 1573-7721</identifier><identifier>DOI: 10.1007/s11042-024-18483-7</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Artificial neural networks ; Computer Communication Networks ; Computer Science ; Data Structures and Information Theory ; Deep learning ; Embedding ; Feature extraction ; Machine learning ; Memory tasks ; Multimedia Information Systems ; Natural language processing ; Special Purpose and Application-Based Systems ; Unstructured data ; Words (language)</subject><ispartof>Multimedia tools and applications, 2024-03, Vol.83 (26), p.68087-68105</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-d251886bc7838056ae63c1f2dcbd72c485cfc9dc4f4b258ed962e63f6a1b1f5e3</cites><orcidid>0000-0002-1979-6809</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11042-024-18483-7$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11042-024-18483-7$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Kavitha, M.</creatorcontrib><creatorcontrib>Akila, K.</creatorcontrib><title>Amplifying document categorization with advanced features and deep learning</title><title>Multimedia tools and applications</title><addtitle>Multimed Tools Appl</addtitle><description>The field of natural language processing (NLP) plays a pivotal role in discerning unstructured data from diverse origins. This study employs advanced techniques rooted in machine learning and deep learning to effectively categorize news articles. Notably, deep learning models have demonstrated superior performance over traditional machine learning algorithms, rendering them a popular choice for a range of NLP tasks. The research employs feature extraction techniques to identify multiword tokens, negation words, and out-of-vocabulary words and replace them. Additionally, convolutional neural network models leverage embedding, convolutional layers, and max pooling layers to capture intricate features. 
For tasks requiring an understanding of dependencies among long phrases, long short-term memory models come into play. The evaluation of the proposed model hinges on training it with datasets like AG News, BBC, and 20 Newsgroup, gauging its efficacy. The study delves into the myriad challenges inherent to text classification. These challenges are thoughtfully discussed, shedding light on the intricacies of the process. Furthermore, the research furnishes comprehensive test outcomes for both conventional machine learning and deep learning models. The significance of this proposed model is that it uses a multiword expression lexicon, wordnet synset, and word embedding techniques for feature extraction. The performance of the models is increased when using these feature extraction techniques.</description><subject>Algorithms</subject><subject>Artificial neural networks</subject><subject>Computer Communication Networks</subject><subject>Computer Science</subject><subject>Data Structures and Information Theory</subject><subject>Deep learning</subject><subject>Embedding</subject><subject>Feature extraction</subject><subject>Machine learning</subject><subject>Memory tasks</subject><subject>Multimedia Information Systems</subject><subject>Natural language processing</subject><subject>Special Purpose and Application-Based Systems</subject><subject>Unstructured data</subject><subject>Words 
(language)</subject><issn>1573-7721</issn><issn>1380-7501</issn><issn>1573-7721</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kDtPwzAUhS0EEqXwB5gsMQf8SuyOVcVLVGKB2XLs65CqdYKdgMqvxxAkmJjuGb5zrvQhdE7JJSVEXiVKiWAFYaKgSiheyAM0o6XMQTJ6-Ccfo5OUNoTQqmRihh6Wu37b-n0bGuw6O-4gDNiaAZouth9maLuA39vhBRv3ZoIFhz2YYYyQsAkOO4Aeb8HEkAdO0ZE32wRnP3eOnm-un1Z3xfrx9n61XBeWSTIUjpVUqaq2UnFFyspAxS31zNnaSWaFKq23C2eFFzUrFbhFxTLiK0Nr6kvgc3Qx7faxex0hDXrTjTHkl5oTxQkVnJBMsYmysUspgtd9bHcm7jUl-kuanqTpLE1_S9Myl_hUShkODcTf6X9an4X4cDg</recordid><startdate>20240304</startdate><enddate>20240304</enddate><creator>Kavitha, M.</creator><creator>Akila, K.</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-1979-6809</orcidid></search><sort><creationdate>20240304</creationdate><title>Amplifying document categorization with advanced features and deep learning</title><author>Kavitha, M. 
; Akila, K.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-d251886bc7838056ae63c1f2dcbd72c485cfc9dc4f4b258ed962e63f6a1b1f5e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Artificial neural networks</topic><topic>Computer Communication Networks</topic><topic>Computer Science</topic><topic>Data Structures and Information Theory</topic><topic>Deep learning</topic><topic>Embedding</topic><topic>Feature extraction</topic><topic>Machine learning</topic><topic>Memory tasks</topic><topic>Multimedia Information Systems</topic><topic>Natural language processing</topic><topic>Special Purpose and Application-Based Systems</topic><topic>Unstructured data</topic><topic>Words (language)</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kavitha, M.</creatorcontrib><creatorcontrib>Akila, K.</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Multimedia tools and applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kavitha, M.</au><au>Akila, K.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Amplifying document categorization with advanced features and deep learning</atitle><jtitle>Multimedia tools and applications</jtitle><stitle>Multimed Tools 
Appl</stitle><date>2024-03-04</date><risdate>2024</risdate><volume>83</volume><issue>26</issue><spage>68087</spage><epage>68105</epage><pages>68087-68105</pages><issn>1573-7721</issn><issn>1380-7501</issn><eissn>1573-7721</eissn><abstract>The field of natural language processing (NLP) plays a pivotal role in discerning unstructured data from diverse origins. This study employs advanced techniques rooted in machine learning and deep learning to effectively categorize news articles. Notably, deep learning models have demonstrated superior performance over traditional machine learning algorithms, rendering them a popular choice for a range of NLP tasks. The research employs feature extraction techniques to identify multiword tokens, negation words, and out-of-vocabulary words and replace them. Additionally, convolutional neural network models leverage embedding, convolutional layers, and max pooling layers to capture intricate features. For tasks requiring an understanding of dependencies among long phrases, long short-term memory models come into play. The evaluation of the proposed model hinges on training it with datasets like AG News, BBC, and 20 Newsgroup, gauging its efficacy. The study delves into the myriad challenges inherent to text classification. These challenges are thoughtfully discussed, shedding light on the intricacies of the process. Furthermore, the research furnishes comprehensive test outcomes for both conventional machine learning and deep learning models. The significance of this proposed model is that it uses a multiword expression lexicon, wordnet synset, and word embedding techniques for feature extraction. The performance of the models is increased when using these feature extraction techniques.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11042-024-18483-7</doi><tpages>19</tpages><orcidid>https://orcid.org/0000-0002-1979-6809</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1573-7721
ispartof Multimedia tools and applications, 2024-03, Vol.83 (26), p.68087-68105
issn 1573-7721
1380-7501
1573-7721
language eng
recordid cdi_proquest_journals_3083014300
source SpringerLink Journals - AutoHoldings
subjects Algorithms
Artificial neural networks
Computer Communication Networks
Computer Science
Data Structures and Information Theory
Deep learning
Embedding
Feature extraction
Machine learning
Memory tasks
Multimedia Information Systems
Natural language processing
Special Purpose and Application-Based Systems
Unstructured data
Words (language)
title Amplifying document categorization with advanced features and deep learning
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T18%3A29%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Amplifying%20document%20categorization%20with%20advanced%20features%20and%20deep%20learning&rft.jtitle=Multimedia%20tools%20and%20applications&rft.au=Kavitha,%20M.&rft.date=2024-03-04&rft.volume=83&rft.issue=26&rft.spage=68087&rft.epage=68105&rft.pages=68087-68105&rft.issn=1573-7721&rft.eissn=1573-7721&rft_id=info:doi/10.1007/s11042-024-18483-7&rft_dat=%3Cproquest_cross%3E3083014300%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3083014300&rft_id=info:pmid/&rfr_iscdi=true