Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018

Keeping computers secure is becoming challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of new internet-enabled products. As many cyberattacks affect businesses’ confidentiality, availability, and integrity, network intrusi...

Bibliographic Details
Published in:	Computational intelligence and neuroscience 2022-08, Vol.2022, p.3131153-11
Main Authors:	Hagar, Abdulnaser A.; Gawali, Bharti W.
Format:	Article
Language:	eng
Subjects:	Accuracy; Algorithms; Artificial neural networks; Availability; Big Data; Business metrics; Classification; Computer networks; Computers; Cyberterrorism; Datasets; Deep Learning; Detectors; Experiments; Intrusion detection systems; Literature reviews; Long short-term memory; Machine learning; Neural networks; Neural Networks, Computer; Prediction models
Online Access:	Full text

container_end_page	11
container_issue
container_start_page	3131153
container_title	Computational intelligence and neuroscience
container_volume	2022
creator	Hagar, Abdulnaser A.; Gawali, Bharti W.
description	Keeping computers secure is becoming more challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of each new internet-enabled product. Because many cyberattacks affect businesses’ confidentiality, availability, and integrity, network intrusion detection systems (NIDS) play an essential role. Network-based intrusion detection uses datasets such as CSE-CIC-IDS2018 to train prediction models. This dataset, which includes fourteen types of attacks, is the latest publicly available big data set for intrusion detection. This work proposes three models to improve the detection of all types of attacks: two deep learning models, a convolutional neural network (CNN) and a long short-term memory (LSTM) network, and an Apache Spark model. To reduce the dimensionality, random forest (RF) was employed to select the important features; it retained 19 of the 84 features. The dataset is imbalanced; thus, oversampling and undersampling techniques were used to reduce the imbalance ratio. In the experiments, the Apache Spark model produced the best results across all 15 classes, with accuracy as high as 100%. For the F1-score, Apache Spark also showed the highest results, with 1.00 for most classes. All three models showed outstanding results for multiclass network intrusion detection.
doi_str_mv	10.1155/2022/3131153
format	Article
fullrecord	<record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9439899</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A716149063</galeid><sourcerecordid>A716149063</sourcerecordid><originalsourceid>FETCH-LOGICAL-c476t-66b785d0f088276e21cfddd18039f23169735cd0d7282a220b81aec3a94404de3</originalsourceid><addsrcrecordid>eNp9kUtvEzEUhUcIREthxxqNxAYJhvoxtscbpGha2kjhIYWuLce-k7hM7GDPUPHv8TQhPBasfCx_5_henaJ4jtFbjBk7J4iQc4ppvtAHxSnmjagYEfThUXN2UjxJ6RYhJhgij4sTyhGTVLLTop_ttNlAudzp-LXU3pYXALtyATp659flh2ChT2UXYnnt1pvqM8Sst9obKD_CcBeya-6HOCYXfPYOYIZJ3aTJ3S4vq3beVvOLJUG4eVo86nSf4NnhPCtu3l9-aa-rxaereTtbVKYWfKg4X4mGWdShpiGCA8Gms9biBlHZEYq5FJQZi6wgDdGEoFWDNRiqZV2j2gI9K97tc3fjagvWQB5Q92oX3VbHHypop_5-8W6j1uG7kjWVjZQ54NUhIIZvI6RBbV0y0PfaQxiTIgJJiQnlE_ryH_Q2jNHn9e4pJgVuxG9qrXtQznch_2umUDUTmONaIk4z9WZPmRhSitAdR8ZITWWrqWx1KDvjL_5c8wj_ajcDr_fAxnmr79z_434CTBKuFQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2709597187</pqid></control><display><type>article</type><title>Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018</title><source>MEDLINE</source><source>Wiley Online Library Open Access</source><source>PMC (PubMed Central)</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Alma/SFX Local Collection</source><source>PubMed Central Open Access</source><creator>Hagar, Abdulnaser A. ; Gawali, Bharti W.</creator><contributor>Ijaz, Muhammad Fazal ; Muhammad Fazal Ijaz</contributor><creatorcontrib>Hagar, Abdulnaser A. ; Gawali, Bharti W. ; Ijaz, Muhammad Fazal ; Muhammad Fazal Ijaz</creatorcontrib><description>Keeping computers secure is becoming challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of new internet-enabled products. 
As many cyberattacks affect businesses’ confidentiality, availability, and integrity, network intrusion detection systems (NIDS) show an essential role. Network-based intrusion detection uses datasets like CSE-CIC-IDS2018 to train prediction models. With fourteen types of attacks included, the latest big data set for intrusion detection is available to the public. This work proposes three models, two deep learning convolutional neural networks (CNN), long short-term memory (LSTM), and Apache Spark, to improve the detection of all types of attacks. To reduce the dimensionality, random forests (RF) was employed to select the important features; it gave 19 from 84 features. The dataset is imbalanced; thus, oversampling and undersampling techniques reduce the imbalance ratio. The Apache Spark model produced the best results across all 15 classes, with accuracy as high as 100% for all classes, as seen by the experiments’ findings. For the F1-score, Apache Spark showed the highest results with 1.00 for most classes. The findings of the three models showed outstanding results for multiclassification network intrusion detection.</description><identifier>ISSN: 1687-5265</identifier><identifier>ISSN: 1687-5273</identifier><identifier>EISSN: 1687-5273</identifier><identifier>DOI: 10.1155/2022/3131153</identifier><identifier>PMID: 36059395</identifier><language>eng</language><publisher>United States: Hindawi</publisher><subject>Accuracy ; Algorithms ; Artificial neural networks ; Availability ; Big Data ; Business metrics ; Classification ; Computer networks ; Computers ; Cyberterrorism ; Datasets ; Deep Learning ; Detectors ; Experiments ; Intrusion detection systems ; Literature reviews ; Long short-term memory ; Machine learning ; Neural networks ; Neural Networks, Computer ; Prediction models</subject><ispartof>Computational intelligence and neuroscience, 2022-08, Vol.2022, p.3131153-11</ispartof><rights>Copyright © 2022 Abdulnaser A. Hagar and Bharti W. 
Gawali.</rights><rights>COPYRIGHT 2022 John Wiley & Sons, Inc.</rights><rights>Copyright © 2022 Abdulnaser A. Hagar and Bharti W. Gawali. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0</rights><rights>Copyright © 2022 Abdulnaser A. Hagar and Bharti W. Gawali. 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c476t-66b785d0f088276e21cfddd18039f23169735cd0d7282a220b81aec3a94404de3</citedby><cites>FETCH-LOGICAL-c476t-66b785d0f088276e21cfddd18039f23169735cd0d7282a220b81aec3a94404de3</cites><orcidid>0000-0003-3351-0966 ; 0000-0002-8353-5849</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9439899/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9439899/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36059395$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Ijaz, Muhammad Fazal</contributor><contributor>Muhammad Fazal Ijaz</contributor><creatorcontrib>Hagar, Abdulnaser A.</creatorcontrib><creatorcontrib>Gawali, Bharti W.</creatorcontrib><title>Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018</title><title>Computational intelligence and 
neuroscience</title><addtitle>Comput Intell Neurosci</addtitle><description>Keeping computers secure is becoming challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of new internet-enabled products. As many cyberattacks affect businesses’ confidentiality, availability, and integrity, network intrusion detection systems (NIDS) show an essential role. Network-based intrusion detection uses datasets like CSE-CIC-IDS2018 to train prediction models. With fourteen types of attacks included, the latest big data set for intrusion detection is available to the public. This work proposes three models, two deep learning convolutional neural networks (CNN), long short-term memory (LSTM), and Apache Spark, to improve the detection of all types of attacks. To reduce the dimensionality, random forests (RF) was employed to select the important features; it gave 19 from 84 features. The dataset is imbalanced; thus, oversampling and undersampling techniques reduce the imbalance ratio. The Apache Spark model produced the best results across all 15 classes, with accuracy as high as 100% for all classes, as seen by the experiments’ findings. For the F1-score, Apache Spark showed the highest results with 1.00 for most classes. 
The findings of the three models showed outstanding results for multiclassification network intrusion detection.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Artificial neural networks</subject><subject>Availability</subject><subject>Big Data</subject><subject>Business metrics</subject><subject>Classification</subject><subject>Computer networks</subject><subject>Computers</subject><subject>Cyberterrorism</subject><subject>Datasets</subject><subject>Deep Learning</subject><subject>Detectors</subject><subject>Experiments</subject><subject>Intrusion detection systems</subject><subject>Literature reviews</subject><subject>Long short-term memory</subject><subject>Machine learning</subject><subject>Neural networks</subject><subject>Neural Networks, Computer</subject><subject>Prediction models</subject><issn>1687-5265</issn><issn>1687-5273</issn><issn>1687-5273</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RHX</sourceid><sourceid>EIF</sourceid><sourceid>BENPR</sourceid><recordid>eNp9kUtvEzEUhUcIREthxxqNxAYJhvoxtscbpGha2kjhIYWuLce-k7hM7GDPUPHv8TQhPBasfCx_5_henaJ4jtFbjBk7J4iQc4ppvtAHxSnmjagYEfThUXN2UjxJ6RYhJhgij4sTyhGTVLLTop_ttNlAudzp-LXU3pYXALtyATp659flh2ChT2UXYnnt1pvqM8Sst9obKD_CcBeya-6HOCYXfPYOYIZJ3aTJ3S4vq3beVvOLJUG4eVo86nSf4NnhPCtu3l9-aa-rxaereTtbVKYWfKg4X4mGWdShpiGCA8Gms9biBlHZEYq5FJQZi6wgDdGEoFWDNRiqZV2j2gI9K97tc3fjagvWQB5Q92oX3VbHHypop_5-8W6j1uG7kjWVjZQ54NUhIIZvI6RBbV0y0PfaQxiTIgJJiQnlE_ryH_Q2jNHn9e4pJgVuxG9qrXtQznch_2umUDUTmONaIk4z9WZPmRhSitAdR8ZITWWrqWx1KDvjL_5c8wj_ajcDr_fAxnmr79z_434CTBKuFQ</recordid><startdate>20220826</startdate><enddate>20220826</enddate><creator>Hagar, Abdulnaser A.</creator><creator>Gawali, Bharti W.</creator><general>Hindawi</general><general>John Wiley & Sons, Inc</general><general>Hindawi 
Limited</general><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QF</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TK</scope><scope>7U5</scope><scope>7X7</scope><scope>7XB</scope><scope>8AL</scope><scope>8BQ</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>COVID</scope><scope>CWDGH</scope><scope>DWQXO</scope><scope>F28</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>H8D</scope><scope>H8G</scope><scope>HCIFZ</scope><scope>JG9</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>KR7</scope><scope>L6V</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PSYQQ</scope><scope>PTHSS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-3351-0966</orcidid><orcidid>https://orcid.org/0000-0002-8353-5849</orcidid></search><sort><creationdate>20220826</creationdate><title>Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018</title><author>Hagar, Abdulnaser A. 
; Gawali, Bharti W.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c476t-66b785d0f088276e21cfddd18039f23169735cd0d7282a220b81aec3a94404de3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Artificial neural networks</topic><topic>Availability</topic><topic>Big Data</topic><topic>Business metrics</topic><topic>Classification</topic><topic>Computer networks</topic><topic>Computers</topic><topic>Cyberterrorism</topic><topic>Datasets</topic><topic>Deep Learning</topic><topic>Detectors</topic><topic>Experiments</topic><topic>Intrusion detection systems</topic><topic>Literature reviews</topic><topic>Long short-term memory</topic><topic>Machine learning</topic><topic>Neural networks</topic><topic>Neural Networks, Computer</topic><topic>Prediction models</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hagar, Abdulnaser A.</creatorcontrib><creatorcontrib>Gawali, Bharti W.</creatorcontrib><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Aluminium Industry Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical & Transportation Engineering 
Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Computing Database (Alumni Edition)</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>Coronavirus Research Database</collection><collection>Middle East & Africa Database</collection><collection>ProQuest Central Korea</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Aerospace Database</collection><collection>Copper Technical Reference Library</collection><collection>SciTech Premium Collection</collection><collection>Materials Research Database</collection><collection>ProQuest 
Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Civil Engineering Abstracts</collection><collection>ProQuest Engineering Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>ProQuest Biological Science Collection</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Engineering Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest One Psychology</collection><collection>Engineering Collection</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Computational intelligence and neuroscience</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hagar, Abdulnaser A.</au><au>Gawali, Bharti W.</au><au>Ijaz, Muhammad Fazal</au><au>Muhammad Fazal Ijaz</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018</atitle><jtitle>Computational 
intelligence and neuroscience</jtitle><addtitle>Comput Intell Neurosci</addtitle><date>2022-08-26</date><risdate>2022</risdate><volume>2022</volume><spage>3131153</spage><epage>11</epage><pages>3131153-11</pages><issn>1687-5265</issn><issn>1687-5273</issn><eissn>1687-5273</eissn><abstract>Keeping computers secure is becoming challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of new internet-enabled products. As many cyberattacks affect businesses’ confidentiality, availability, and integrity, network intrusion detection systems (NIDS) show an essential role. Network-based intrusion detection uses datasets like CSE-CIC-IDS2018 to train prediction models. With fourteen types of attacks included, the latest big data set for intrusion detection is available to the public. This work proposes three models, two deep learning convolutional neural networks (CNN), long short-term memory (LSTM), and Apache Spark, to improve the detection of all types of attacks. To reduce the dimensionality, random forests (RF) was employed to select the important features; it gave 19 from 84 features. The dataset is imbalanced; thus, oversampling and undersampling techniques reduce the imbalance ratio. The Apache Spark model produced the best results across all 15 classes, with accuracy as high as 100% for all classes, as seen by the experiments’ findings. For the F1-score, Apache Spark showed the highest results with 1.00 for most classes. The findings of the three models showed outstanding results for multiclassification network intrusion detection.</abstract><cop>United States</cop><pub>Hindawi</pub><pmid>36059395</pmid><doi>10.1155/2022/3131153</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0003-3351-0966</orcidid><orcidid>https://orcid.org/0000-0002-8353-5849</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1687-5265
ispartof	Computational intelligence and neuroscience, 2022-08, Vol.2022, p.3131153-11
issn	1687-5265 1687-5273 1687-5273
language	eng
recordid	cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9439899
source	MEDLINE; Wiley Online Library Open Access; PMC (PubMed Central); EZB-FREE-00999 freely available EZB journals; Alma/SFX Local Collection; PubMed Central Open Access
subjects	Accuracy; Algorithms; Artificial neural networks; Availability; Big Data; Business metrics; Classification; Computer networks; Computers; Cyberterrorism; Datasets; Deep Learning; Detectors; Experiments; Intrusion detection systems; Literature reviews; Long short-term memory; Machine learning; Neural networks; Neural Networks, Computer; Prediction models
title	Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T00%3A40%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Apache%20Spark%20and%20Deep%20Learning%20Models%20for%20High-Performance%20Network%20Intrusion%20Detection%20Using%20CSE-CIC-IDS2018&rft.jtitle=Computational%20intelligence%20and%20neuroscience&rft.au=Hagar,%20Abdulnaser%20A.&rft.date=2022-08-26&rft.volume=2022&rft.spage=3131153&rft.epage=11&rft.pages=3131153-11&rft.issn=1687-5265&rft.eissn=1687-5273&rft_id=info:doi/10.1155/2022/3131153&rft_dat=%3Cgale_pubme%3EA716149063%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2709597187&rft_id=info:pmid/36059395&rft_galeid=A716149063&rfr_iscdi=true