Machine Learning Based Classification for Spam Detection

Electronic messages, i.e. e-mails, are a communication tool frequently used by individuals or organizations. While e-mail is extremely practical to use, it is necessary to consider its vulnerabilities. Spam e-mails are unsolicited messages created to promote a product or service, often se...

Detailed description

Bibliographic details
Published in:	Sakarya Üniversitesi Fen Bilimleri Enstitüsü Dergisi 2024-04, Vol.28 (2), p.270-282
Main authors:	Keskin, Serkan; Sevli, Onur
Format:	Article
Language:	eng
Online access:	Full text

container_end_page	282
container_issue	2
container_start_page	270
container_title	Sakarya Üniversitesi Fen Bilimleri Enstitüsü Dergisi
container_volume	28
creator	Keskin, Serkan; Sevli, Onur
description	Electronic messages, i.e. e-mails, are a communication tool frequently used by individuals and organizations. While e-mail is extremely practical to use, its vulnerabilities must be considered. Spam e-mails are unsolicited messages created to promote a product or service, often sent repeatedly. Classifying incoming e-mails is very important in order to protect against malware that can be transmitted via e-mail and to reduce possible unwanted consequences. Spam e-mail classification is the process of identifying spam e-mails and distinguishing them from legitimate e-mails. This classification can be done through various methods such as keyword filtering, machine learning algorithms and image recognition. The goal of spam e-mail classification is to prevent unwanted and potentially harmful e-mails from reaching the user's inbox. In this study, Random Forest (RF), Logistic Regression (LR), Naive Bayes (NB), Support Vector Machine (SVM) and Artificial Neural Network (ANN) algorithms are used to classify spam e-mails and the results are compared. Algorithms with different approaches were used to determine the best solution for the problem. A total of 5558 spam and non-spam e-mails were analyzed and the performance of the algorithms was reported in terms of accuracy, precision, sensitivity and F1-Score metrics. The most successful result was obtained with the RF algorithm, with an accuracy of 98.83%. In this study, high success was achieved in classifying spam e-mails with machine learning algorithms. In addition, the experimental studies show that better results are obtained than in similar studies in the literature.
doi_str_mv	10.16984/saufenbilder.1264476
format	Article
date	2024-04-30
eissn	2147-835X
genre	article
lds50	peer_reviewed
oa	free_for_read
orcidid	https://orcid.org/0000-0002-8933-8395 ; https://orcid.org/0000-0001-9404-5039
ristype	JOUR
sourceid	crossref
sourcetype	Aggregation Database
tpages	13
fulltext	fulltext
identifier	ISSN: 2147-835X
ispartof	Sakarya Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 2024-04, Vol.28 (2), p.270-282
issn	2147-835X 2147-835X
language	eng
recordid	cdi_crossref_primary_10_16984_saufenbilder_1264476
source	EBSCOhost Business Source Complete
title	Machine Learning Based Classification for Spam Detection
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T10%3A36%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Machine%20Learning%20Based%20Classification%20for%20Spam%20Detection&rft.jtitle=Sakarya%20%C3%9Cniversitesi%20Fen%20Bilimleri%20Enstit%C3%BCs%C3%BC%20Dergisi&rft.au=Keskin,%20Serkan&rft.date=2024-04-30&rft.volume=28&rft.issue=2&rft.spage=270&rft.epage=282&rft.pages=270-282&rft.issn=2147-835X&rft.eissn=2147-835X&rft_id=info:doi/10.16984/saufenbilder.1264476&rft_dat=%3Ccrossref%3E10_16984_saufenbilder_1264476%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
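
The description field above names four evaluation metrics (accuracy, precision, sensitivity and F1-Score) without defining them. As a minimal sketch, assuming the usual binary definitions with "spam" as the positive class, they could be computed as follows. The `Metrics` class, the `evaluate` function and the example labels are hypothetical illustrations; they are not taken from the article and do not reproduce its data or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    accuracy: float
    precision: float
    sensitivity: float  # also called recall
    f1: float


def evaluate(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, sensitivity and F1 for a binary task."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with `positive` as the positive class.
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    total = len(pairs)
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return Metrics(accuracy, precision, sensitivity, f1)


# Hypothetical labels, invented purely for illustration (not the study's data).
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "ham", "ham", "ham", "spam", "ham"]
m = evaluate(y_true, y_pred)
print(m)
```

With "spam" as the positive class, sensitivity is the share of actual spam that is caught, while precision is the share of flagged messages that really are spam. Accuracy alone can look high on an imbalanced mailbox, which is presumably why the description reports all four.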