Email spam detection and filtering using machine learning

Phishing attacks, in which the perpetrator masquerades as a legitimate source in order to obtain confidential information, have become a serious threat with the rapid growth in online consumers; they can damage a victim’s credibility, cost the victim money, or infect the victim’s computer with spyware and perhaps other vi...

Bibliographic details
Main authors: Asha, P., Siddhartha, Katakam, Manikanta, Kodati Naga Satya Sai, Gopi, Chilukuri, Mayan, J. Albert
Format: Conference proceeding
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 1
container_start_page
container_title
container_volume 3075
creator Asha, P.
Siddhartha, Katakam
Manikanta, Kodati Naga Satya Sai
Gopi, Chilukuri
Mayan, J. Albert
description Phishing attacks, in which the perpetrator masquerades as a legitimate source in order to obtain confidential information, have become a serious threat with the rapid growth in online consumers; they can damage a victim’s credibility, cost the victim money, or infect the victim’s computer with spyware and possibly other malware. Because they can sift through large amounts of data for patterns that support prediction, intelligent approaches such as machine learning (ML) and deep learning (DL) are seeing growing use in cybersecurity. In this study, we explore how effectively such methods identify phishing websites. We used two data sets and selected the most highly linked attributes, which included both content-based and URL-lexical/domain-based characteristics. Several ML models were then implemented and their relative performance assessed. The results demonstrate the importance of feature selection in improving model quality. The findings also aim to identify the factors that most influence a model’s ability to recognize phishing websites. For classification, the Random Forest (RF) algorithm performed best across the board.
doi_str_mv 10.1063/5.0217574
format Conference Proceeding
fullrecord <record><control><sourceid>proquest_scita</sourceid><recordid>TN_cdi_scitation_primary_10_1063_5_0217574</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3085724141</sourcerecordid><originalsourceid>FETCH-LOGICAL-p634-c44c4022464c2df923f61df071faff396d85df7681bc5ffa02b27bd897f307853</originalsourceid><addsrcrecordid>eNotUEtLAzEYDKLgWj34DwLehK1f3tmjlFqFgpcevIVsHpqyLze7B_-9W9rLDAzDDDMIPRJYE5DsRayBEiUUv0IFEYKUShJ5jQqAipeUs69bdJfzEYBWSukCVdvWpgbnwbbYhym4KfUdtp3HMTVTGFP3jed8wta6n9QF3AQ7dotwj26ibXJ4uPAKHd62h817uf_cfWxe9-UgGS8d544DpVxyR32sKIuS-AiKRBsjq6TXwkclNamdiNECramqva5UZKC0YCv0dI4dxv53Dnkyx34eu6XRMNBCUU44WVzPZ1d2abKnDWYYU2vHP0PAnJ4xwlyeYf9yN1Q_</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>3085724141</pqid></control><display><type>conference_proceeding</type><title>Email spam detection and filtering using machine learning</title><source>AIP Journals Complete</source><creator>Asha, P. ; Siddhartha, Katakam ; Manikanta, Kodati Naga Satya Sai ; Gopi, Chilukuri ; Mayan, J. Albert</creator><contributor>Godfrey Winster, S ; Pushpalatha, M ; Baskar, M ; Kishore Anthuvan Sahayaraj, K</contributor><creatorcontrib>Asha, P. ; Siddhartha, Katakam ; Manikanta, Kodati Naga Satya Sai ; Gopi, Chilukuri ; Mayan, J. Albert ; Godfrey Winster, S ; Pushpalatha, M ; Baskar, M ; Kishore Anthuvan Sahayaraj, K</creatorcontrib><description>Phishing assaults, in which the perpetrator masquerades as a legitimate source in order to obtain confidential material, are now a serious threat due to the rapid growth of online consumers damaging one’s credibility, costing one’s money, or infecting one’s computer with spyware and perhaps other viruses. Due to their capacity to sift through large amounts of data in search of patterns that can be used to make predictions, intelligent approaches like ML &amp; DL were finding growing usage in the realm of cybersecurity. 
In this study, we explore the efficacy of using such clever methods to identify phishing websites. We utilized two different data sets and picked the most highly linked attributes, which included both content-based and URL-lexical/domain-based characteristics. After that, many ML models were implemented, and their relative efficacy was assessed. The results demonstrated the significance of selecting features in raising the quality of the models. In addition, the findings attempted to determine the most useful factors that affect the model when it comes to recognizing phishing websites. When it came to classifying data, the Random Forest (RF) algorithm performed best across the board.</description><identifier>ISSN: 0094-243X</identifier><identifier>EISSN: 1551-7616</identifier><identifier>DOI: 10.1063/5.0217574</identifier><identifier>CODEN: APCPCS</identifier><language>eng</language><publisher>Melville: American Institute of Physics</publisher><subject>Algorithms ; Cybercrime ; Cybersecurity ; Damage detection ; Effectiveness ; Feature recognition ; Identification methods ; Machine learning ; Phishing ; Websites</subject><ispartof>AIP conference proceedings, 2024, Vol.3075 (1)</ispartof><rights>Author(s)</rights><rights>2024 Author(s). 
Published under an exclusive license by AIP Publishing.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://pubs.aip.org/acp/article-lookup/doi/10.1063/5.0217574$$EHTML$$P50$$Gscitation$$H</linktohtml><link.rule.ids>309,310,314,776,780,785,786,790,4498,23909,23910,25118,27901,27902,76126</link.rule.ids></links><search><contributor>Godfrey Winster, S</contributor><contributor>Pushpalatha, M</contributor><contributor>Baskar, M</contributor><contributor>Kishore Anthuvan Sahayaraj, K</contributor><creatorcontrib>Asha, P.</creatorcontrib><creatorcontrib>Siddhartha, Katakam</creatorcontrib><creatorcontrib>Manikanta, Kodati Naga Satya Sai</creatorcontrib><creatorcontrib>Gopi, Chilukuri</creatorcontrib><creatorcontrib>Mayan, J. Albert</creatorcontrib><title>Email spam detection and filtering using machine learning</title><title>AIP conference proceedings</title><description>Phishing assaults, in which the perpetrator masquerades as a legitimate source in order to obtain confidential material, are now a serious threat due to the rapid growth of online consumers damaging one’s credibility, costing one’s money, or infecting one’s computer with spyware and perhaps other viruses. Due to their capacity to sift through large amounts of data in search of patterns that can be used to make predictions, intelligent approaches like ML &amp; DL were finding growing usage in the realm of cybersecurity. In this study, we explore the efficacy of using such clever methods to identify phishing websites. We utilized two different data sets and picked the most highly linked attributes, which included both content-based and URL-lexical/domain-based characteristics. After that, many ML models were implemented, and their relative efficacy was assessed. 
The results demonstrated the significance of selecting features in raising the quality of the models. In addition, the findings attempted to determine the most useful factors that affect the model when it comes to recognizing phishing websites. When it came to classifying data, the Random Forest (RF) algorithm performed best across the board.</description><subject>Algorithms</subject><subject>Cybercrime</subject><subject>Cybersecurity</subject><subject>Damage detection</subject><subject>Effectiveness</subject><subject>Feature recognition</subject><subject>Identification methods</subject><subject>Machine learning</subject><subject>Phishing</subject><subject>Websites</subject><issn>0094-243X</issn><issn>1551-7616</issn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2024</creationdate><recordtype>conference_proceeding</recordtype><recordid>eNotUEtLAzEYDKLgWj34DwLehK1f3tmjlFqFgpcevIVsHpqyLze7B_-9W9rLDAzDDDMIPRJYE5DsRayBEiUUv0IFEYKUShJ5jQqAipeUs69bdJfzEYBWSukCVdvWpgbnwbbYhym4KfUdtp3HMTVTGFP3jed8wta6n9QF3AQ7dotwj26ibXJ4uPAKHd62h817uf_cfWxe9-UgGS8d544DpVxyR32sKIuS-AiKRBsjq6TXwkclNamdiNECramqva5UZKC0YCv0dI4dxv53Dnkyx34eu6XRMNBCUU44WVzPZ1d2abKnDWYYU2vHP0PAnJ4xwlyeYf9yN1Q_</recordid><startdate>20240729</startdate><enddate>20240729</enddate><creator>Asha, P.</creator><creator>Siddhartha, Katakam</creator><creator>Manikanta, Kodati Naga Satya Sai</creator><creator>Gopi, Chilukuri</creator><creator>Mayan, J. Albert</creator><general>American Institute of Physics</general><scope>8FD</scope><scope>H8D</scope><scope>L7M</scope></search><sort><creationdate>20240729</creationdate><title>Email spam detection and filtering using machine learning</title><author>Asha, P. ; Siddhartha, Katakam ; Manikanta, Kodati Naga Satya Sai ; Gopi, Chilukuri ; Mayan, J. 
Albert</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p634-c44c4022464c2df923f61df071faff396d85df7681bc5ffa02b27bd897f307853</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Cybercrime</topic><topic>Cybersecurity</topic><topic>Damage detection</topic><topic>Effectiveness</topic><topic>Feature recognition</topic><topic>Identification methods</topic><topic>Machine learning</topic><topic>Phishing</topic><topic>Websites</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Asha, P.</creatorcontrib><creatorcontrib>Siddhartha, Katakam</creatorcontrib><creatorcontrib>Manikanta, Kodati Naga Satya Sai</creatorcontrib><creatorcontrib>Gopi, Chilukuri</creatorcontrib><creatorcontrib>Mayan, J. Albert</creatorcontrib><collection>Technology Research Database</collection><collection>Aerospace Database</collection><collection>Advanced Technologies Database with Aerospace</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Asha, P.</au><au>Siddhartha, Katakam</au><au>Manikanta, Kodati Naga Satya Sai</au><au>Gopi, Chilukuri</au><au>Mayan, J. 
Albert</au><au>Godfrey Winster, S</au><au>Pushpalatha, M</au><au>Baskar, M</au><au>Kishore Anthuvan Sahayaraj, K</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Email spam detection and filtering using machine learning</atitle><btitle>AIP conference proceedings</btitle><date>2024-07-29</date><risdate>2024</risdate><volume>3075</volume><issue>1</issue><issn>0094-243X</issn><eissn>1551-7616</eissn><coden>APCPCS</coden><abstract>Phishing assaults, in which the perpetrator masquerades as a legitimate source in order to obtain confidential material, are now a serious threat due to the rapid growth of online consumers damaging one’s credibility, costing one’s money, or infecting one’s computer with spyware and perhaps other viruses. Due to their capacity to sift through large amounts of data in search of patterns that can be used to make predictions, intelligent approaches like ML &amp; DL were finding growing usage in the realm of cybersecurity. In this study, we explore the efficacy of using such clever methods to identify phishing websites. We utilized two different data sets and picked the most highly linked attributes, which included both content-based and URL-lexical/domain-based characteristics. After that, many ML models were implemented, and their relative efficacy was assessed. The results demonstrated the significance of selecting features in raising the quality of the models. In addition, the findings attempted to determine the most useful factors that affect the model when it comes to recognizing phishing websites. When it came to classifying data, the Random Forest (RF) algorithm performed best across the board.</abstract><cop>Melville</cop><pub>American Institute of Physics</pub><doi>10.1063/5.0217574</doi><tpages>11</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0094-243X
ispartof AIP conference proceedings, 2024, Vol.3075 (1)
issn 0094-243X
1551-7616
language eng
recordid cdi_scitation_primary_10_1063_5_0217574
source AIP Journals Complete
subjects Algorithms
Cybercrime
Cybersecurity
Damage detection
Effectiveness
Feature recognition
Identification methods
Machine learning
Phishing
Websites
title Email spam detection and filtering using machine learning
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T08%3A51%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Email%20spam%20detection%20and%20filtering%20using%20machine%20learning&rft.btitle=AIP%20conference%20proceedings&rft.au=Asha,%20P.&rft.date=2024-07-29&rft.volume=3075&rft.issue=1&rft.issn=0094-243X&rft.eissn=1551-7616&rft.coden=APCPCS&rft_id=info:doi/10.1063/5.0217574&rft_dat=%3Cproquest_scita%3E3085724141%3C/proquest_scita%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3085724141&rft_id=info:pmid/&rfr_iscdi=true