Machine Learning Based Classification for Spam Detection

Electronic messages, i.e. e-mails, are a communication tool frequently used by individuals or organizations. While e-mail is extremely practical to use, it is necessary to consider its vulnerabilities. Spam e-mails are unsolicited messages created to promote a product or service, often se...

Bibliographic details
Published in: Sakarya Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 2024-04, Vol. 28 (2), p. 270-282
Main authors: Keskin, Serkan; Sevli, Onur
Format: Article
Language: eng
Online access: Full text
container_end_page 282
container_issue 2
container_start_page 270
container_title Sakarya Üniversitesi Fen Bilimleri Enstitüsü Dergisi
container_volume 28
creator Keskin, Serkan
Sevli, Onur
description Electronic messages, i.e. e-mails, are a communication tool frequently used by individuals or organizations. While e-mail is extremely practical to use, it is necessary to consider its vulnerabilities. Spam e-mails are unsolicited messages created to promote a product or service, often sent frequently. It is very important to classify incoming e-mails in order to protect against malware that can be transmitted via e-mail and to reduce possible unwanted consequences. Spam email classification is the process of identifying and distinguishing spam emails from legitimate emails. This classification can be done through various methods such as keyword filtering, machine learning algorithms and image recognition. The goal of spam email classification is to prevent unwanted and potentially harmful emails from reaching the user's inbox. In this study, Random Forest (RF), Logistic Regression (LR), Naive Bayes (NB), Support Vector Machine (SVM) and Artificial Neural Network (ANN) algorithms are used to classify spam emails and the results are compared. Algorithms with different approaches were used to determine the best solution for the problem. 5558 spam and non-spam e-mails were analyzed and the performance of the algorithms was reported in terms of accuracy, precision, sensitivity and F1-Score metrics. The most successful result was obtained with the RF algorithm with an accuracy of 98.83%. In this study, high success was achieved by classifying spam emails with machine learning algorithms. In addition, it has been proved by experimental studies that better results are obtained than similar studies in the literature.
doi_str_mv 10.16984/saufenbilder.1264476
format Article
creationdate 2024-04-30
orcidid https://orcid.org/0000-0002-8933-8395
https://orcid.org/0000-0001-9404-5039
tpages 13
toplevel peer_reviewed
online_resources
oa free_for_read
collection CrossRef
fulltext fulltext
identifier ISSN: 2147-835X
ispartof Sakarya Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 2024-04, Vol.28 (2), p.270-282
issn 2147-835X
2147-835X
language eng
recordid cdi_crossref_primary_10_16984_saufenbilder_1264476
source EBSCOhost Business Source Complete
title Machine Learning Based Classification for Spam Detection
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T10%3A36%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Machine%20Learning%20Based%20Classification%20for%20Spam%20Detection&rft.jtitle=Sakarya%20%C3%9Cniversitesi%20Fen%20Bilimleri%20Enstit%C3%BCs%C3%BC%20Dergisi&rft.au=Keskin,%20Serkan&rft.date=2024-04-30&rft.volume=28&rft.issue=2&rft.spage=270&rft.epage=282&rft.pages=270-282&rft.issn=2147-835X&rft.eissn=2147-835X&rft_id=info:doi/10.16984/saufenbilder.1264476&rft_dat=%3Ccrossref%3E10_16984_saufenbilder_1264476%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
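
The description above reports classifier performance in terms of accuracy, precision, sensitivity and F1-score. As a minimal sketch of how these four metrics follow from the counts of a binary confusion matrix (spam as the positive class), the snippet below may help. The function name `binary_metrics` is hypothetical, and the counts are made-up illustrative values, not results from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, sensitivity (recall) and F1 for a binary task.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative and
    true-negative counts. Any ratio with a zero denominator is reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Illustrative counts only (assumption): 1000 messages, 95 of them spam.
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=895)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting precision and sensitivity alongside accuracy matters when spam and non-spam messages are unevenly represented, since accuracy alone can look high even if many spam messages are missed.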