Amplifying document categorization with advanced features and deep learning
The field of natural language processing (NLP) plays a pivotal role in discerning unstructured data from diverse origins. This study employs advanced techniques rooted in machine learning and deep learning to effectively categorize news articles. Notably, deep learning models have demonstrated super...
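Published in: | Multimedia tools and applications 2024-03, Vol.83 (26), p.68087-68105 |
---|---|
Main authors: | Kavitha, M. ; Akila, K. |
Format: | Article |
Language: | eng |
Subjects: | Algorithms ; Artificial neural networks ; Computer Communication Networks ; Computer Science ; Data Structures and Information Theory ; Deep learning ; Embedding ; Feature extraction ; Machine learning ; Memory tasks ; Multimedia Information Systems ; Natural language processing ; Special Purpose and Application-Based Systems ; Unstructured data ; Words (language) |
Online access: | Full text |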
container_end_page | 68105 |
---|---|
container_issue | 26 |
container_start_page | 68087 |
container_title | Multimedia tools and applications |
container_volume | 83 |
creator | Kavitha, M. ; Akila, K. |
description | The field of natural language processing (NLP) plays a pivotal role in discerning unstructured data from diverse origins. This study employs advanced techniques rooted in machine learning and deep learning to effectively categorize news articles. Notably, deep learning models have demonstrated superior performance over traditional machine learning algorithms, rendering them a popular choice for a range of NLP tasks. The research employs feature extraction techniques to identify multiword tokens, negation words, and out-of-vocabulary words and replace them. Additionally, convolutional neural network models leverage embedding, convolutional layers, and max pooling layers to capture intricate features. For tasks requiring an understanding of dependencies among long phrases, long short-term memory models come into play. The evaluation of the proposed model hinges on training it with datasets like AG News, BBC, and 20 Newsgroup, gauging its efficacy. The study delves into the myriad challenges inherent to text classification. These challenges are thoughtfully discussed, shedding light on the intricacies of the process. Furthermore, the research furnishes comprehensive test outcomes for both conventional machine learning and deep learning models. The significance of this proposed model is that it uses a multiword expression lexicon, wordnet synset, and word embedding techniques for feature extraction. The performance of the models is increased when using these feature extraction techniques. |
doi_str_mv | 10.1007/s11042-024-18483-7 |
format | Article |
date | 2024-03-04 |
eissn | 1573-7721 |
orcid | https://orcid.org/0000-0002-1979-6809 |
publisher | New York: Springer US |
rights | The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. |
tpages | 19 |
linktohtml | https://link.springer.com/10.1007/s11042-024-18483-7 |
linktopdf | https://link.springer.com/content/pdf/10.1007/s11042-024-18483-7 |
fulltext | fulltext |
identifier | ISSN: 1573-7721 |
ispartof | Multimedia tools and applications, 2024-03, Vol.83 (26), p.68087-68105 |
issn | 1573-7721 1380-7501 1573-7721 |
language | eng |
recordid | cdi_proquest_journals_3083014300 |
source | SpringerLink Journals - AutoHoldings |
subjects | Algorithms ; Artificial neural networks ; Computer Communication Networks ; Computer Science ; Data Structures and Information Theory ; Deep learning ; Embedding ; Feature extraction ; Machine learning ; Memory tasks ; Multimedia Information Systems ; Natural language processing ; Special Purpose and Application-Based Systems ; Unstructured data ; Words (language) |
title | Amplifying document categorization with advanced features and deep learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T18%3A29%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Amplifying%20document%20categorization%20with%20advanced%20features%20and%20deep%20learning&rft.jtitle=Multimedia%20tools%20and%20applications&rft.au=Kavitha,%20M.&rft.date=2024-03-04&rft.volume=83&rft.issue=26&rft.spage=68087&rft.epage=68105&rft.pages=68087-68105&rft.issn=1573-7721&rft.eissn=1573-7721&rft_id=info:doi/10.1007/s11042-024-18483-7&rft_dat=%3Cproquest_cross%3E3083014300%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3083014300&rft_id=info:pmid/&rfr_iscdi=true |
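
The abstract describes a feature-extraction step that identifies multiword tokens, negation words, and out-of-vocabulary words and replaces them, but this record does not say how the replacement is done. The following is only a minimal sketch of what such a step might look like, not the authors' method. Everything in it is an assumption made for illustration: the toy lexicon `MULTIWORD_LEXICON`, the word list `NEGATION_WORDS`, the `VOCABULARY` set, the `NOT_` prefix convention, and the `<unk>` placeholder. The paper itself reports using a multiword expression lexicon, WordNet synsets, and word embeddings, none of which are reproduced here.

```python
import re

# Hypothetical toy resources; the paper's actual lexicon and vocabulary
# are not given in this catalog record.
MULTIWORD_LEXICON = {("new", "york"), ("stock", "market")}
NEGATION_WORDS = {"not", "no", "never"}
VOCABULARY = {"the", "new_york", "stock_market", "did", "NOT_rise", "today"}
OOV_TOKEN = "<unk>"  # assumed placeholder for out-of-vocabulary words


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def merge_multiwords(tokens):
    """Join adjacent token pairs found in the lexicon into one token."""
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in MULTIWORD_LEXICON:
            merged.append("_".join(pair))
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def mark_negation(tokens):
    """Fold a negation word into the following token (assumed convention)."""
    marked, i = [], 0
    while i < len(tokens):
        if tokens[i] in NEGATION_WORDS and i + 1 < len(tokens):
            marked.append("NOT_" + tokens[i + 1])
            i += 2
        else:
            marked.append(tokens[i])
            i += 1
    return marked


def replace_oov(tokens):
    """Replace every token outside the vocabulary with the placeholder."""
    return [t if t in VOCABULARY else OOV_TOKEN for t in tokens]


def preprocess(text):
    """Run the three replacement steps in sequence."""
    return replace_oov(mark_negation(merge_multiwords(tokenize(text))))


if __name__ == "__main__":
    print(preprocess("The New York stock market did not rise today, analysts said."))
```

The order of the steps matters in this sketch: multiword merging runs first so that a lexicon entry is not split by the negation step, and the out-of-vocabulary check runs last so that it sees the merged and negation-marked tokens. A pipeline built on WordNet could instead substitute a synonym or antonym at the last two steps, which is one plausible reading of the abstract but is not confirmed by this record.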