Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018

Keeping computers secure is becoming challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of new internet-enabled products. As many cyberattacks affect businesses’ confidentiality, availability, and integrity, network intrusi...

Bibliographic details
Published in: Computational intelligence and neuroscience 2022-08, Vol.2022, p.3131153-11
Main authors: Hagar, Abdulnaser A., Gawali, Bharti W.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 11
container_issue
container_start_page 3131153
container_title Computational intelligence and neuroscience
container_volume 2022
creator Hagar, Abdulnaser A.
Gawali, Bharti W.
description Keeping computers secure is becoming more challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of each new internet-enabled product. Because many cyberattacks affect the confidentiality, availability, and integrity of businesses, network intrusion detection systems (NIDS) play an essential role. Network-based intrusion detection uses datasets such as CSE-CIC-IDS2018 to train prediction models. This dataset, which includes fourteen types of attacks, is the latest big dataset for intrusion detection available to the public. This work proposes three models to improve the detection of all types of attacks: two deep learning models, a convolutional neural network (CNN) and a long short-term memory (LSTM) network, and a model built with Apache Spark. To reduce dimensionality, a random forest (RF) was used to select the important features; it retained 19 of the 84 features. Because the dataset is imbalanced, oversampling and undersampling techniques were applied to reduce the imbalance ratio. In the experiments, the Apache Spark model produced the best results across all 15 classes, with accuracy as high as 100% for all classes. It also achieved the highest F1-scores, reaching 1.00 for most classes. All three models showed outstanding results for multiclass network intrusion detection.
doi_str_mv 10.1155/2022/3131153
format Article
fullrecord <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9439899</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A716149063</galeid><sourcerecordid>A716149063</sourcerecordid><originalsourceid>FETCH-LOGICAL-c476t-66b785d0f088276e21cfddd18039f23169735cd0d7282a220b81aec3a94404de3</originalsourceid><addsrcrecordid>eNp9kUtvEzEUhUcIREthxxqNxAYJhvoxtscbpGha2kjhIYWuLce-k7hM7GDPUPHv8TQhPBasfCx_5_henaJ4jtFbjBk7J4iQc4ppvtAHxSnmjagYEfThUXN2UjxJ6RYhJhgij4sTyhGTVLLTop_ttNlAudzp-LXU3pYXALtyATp659flh2ChT2UXYnnt1pvqM8Sst9obKD_CcBeya-6HOCYXfPYOYIZJ3aTJ3S4vq3beVvOLJUG4eVo86nSf4NnhPCtu3l9-aa-rxaereTtbVKYWfKg4X4mGWdShpiGCA8Gms9biBlHZEYq5FJQZi6wgDdGEoFWDNRiqZV2j2gI9K97tc3fjagvWQB5Q92oX3VbHHypop_5-8W6j1uG7kjWVjZQ54NUhIIZvI6RBbV0y0PfaQxiTIgJJiQnlE_ryH_Q2jNHn9e4pJgVuxG9qrXtQznch_2umUDUTmONaIk4z9WZPmRhSitAdR8ZITWWrqWx1KDvjL_5c8wj_ajcDr_fAxnmr79z_434CTBKuFQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2709597187</pqid></control><display><type>article</type><title>Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018</title><source>MEDLINE</source><source>Wiley Online Library Open Access</source><source>PMC (PubMed Central)</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Alma/SFX Local Collection</source><source>PubMed Central Open Access</source><creator>Hagar, Abdulnaser A. ; Gawali, Bharti W.</creator><contributor>Ijaz, Muhammad Fazal ; Muhammad Fazal Ijaz</contributor><creatorcontrib>Hagar, Abdulnaser A. ; Gawali, Bharti W. ; Ijaz, Muhammad Fazal ; Muhammad Fazal Ijaz</creatorcontrib><description>Keeping computers secure is becoming challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of new internet-enabled products. 
As many cyberattacks affect businesses’ confidentiality, availability, and integrity, network intrusion detection systems (NIDS) show an essential role. Network-based intrusion detection uses datasets like CSE-CIC-IDS2018 to train prediction models. With fourteen types of attacks included, the latest big data set for intrusion detection is available to the public. This work proposes three models, two deep learning convolutional neural networks (CNN), long short-term memory (LSTM), and Apache Spark, to improve the detection of all types of attacks. To reduce the dimensionality, random forests (RF) was employed to select the important features; it gave 19 from 84 features. The dataset is imbalanced; thus, oversampling and undersampling techniques reduce the imbalance ratio. The Apache Spark model produced the best results across all 15 classes, with accuracy as high as 100% for all classes, as seen by the experiments’ findings. For the F1-score, Apache Spark showed the highest results with 1.00 for most classes. The findings of the three models showed outstanding results for multiclassification network intrusion detection.</description><identifier>ISSN: 1687-5265</identifier><identifier>ISSN: 1687-5273</identifier><identifier>EISSN: 1687-5273</identifier><identifier>DOI: 10.1155/2022/3131153</identifier><identifier>PMID: 36059395</identifier><language>eng</language><publisher>United States: Hindawi</publisher><subject>Accuracy ; Algorithms ; Artificial neural networks ; Availability ; Big Data ; Business metrics ; Classification ; Computer networks ; Computers ; Cyberterrorism ; Datasets ; Deep Learning ; Detectors ; Experiments ; Intrusion detection systems ; Literature reviews ; Long short-term memory ; Machine learning ; Neural networks ; Neural Networks, Computer ; Prediction models</subject><ispartof>Computational intelligence and neuroscience, 2022-08, Vol.2022, p.3131153-11</ispartof><rights>Copyright © 2022 Abdulnaser A. Hagar and Bharti W. 
Gawali.</rights><rights>COPYRIGHT 2022 John Wiley &amp; Sons, Inc.</rights><rights>Copyright © 2022 Abdulnaser A. Hagar and Bharti W. Gawali. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0</rights><rights>Copyright © 2022 Abdulnaser A. Hagar and Bharti W. Gawali. 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c476t-66b785d0f088276e21cfddd18039f23169735cd0d7282a220b81aec3a94404de3</citedby><cites>FETCH-LOGICAL-c476t-66b785d0f088276e21cfddd18039f23169735cd0d7282a220b81aec3a94404de3</cites><orcidid>0000-0003-3351-0966 ; 0000-0002-8353-5849</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9439899/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9439899/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36059395$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Ijaz, Muhammad Fazal</contributor><contributor>Muhammad Fazal Ijaz</contributor><creatorcontrib>Hagar, Abdulnaser A.</creatorcontrib><creatorcontrib>Gawali, Bharti W.</creatorcontrib><title>Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018</title><title>Computational intelligence and 
neuroscience</title><addtitle>Comput Intell Neurosci</addtitle><description>Keeping computers secure is becoming challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of new internet-enabled products. As many cyberattacks affect businesses’ confidentiality, availability, and integrity, network intrusion detection systems (NIDS) show an essential role. Network-based intrusion detection uses datasets like CSE-CIC-IDS2018 to train prediction models. With fourteen types of attacks included, the latest big data set for intrusion detection is available to the public. This work proposes three models, two deep learning convolutional neural networks (CNN), long short-term memory (LSTM), and Apache Spark, to improve the detection of all types of attacks. To reduce the dimensionality, random forests (RF) was employed to select the important features; it gave 19 from 84 features. The dataset is imbalanced; thus, oversampling and undersampling techniques reduce the imbalance ratio. The Apache Spark model produced the best results across all 15 classes, with accuracy as high as 100% for all classes, as seen by the experiments’ findings. For the F1-score, Apache Spark showed the highest results with 1.00 for most classes. 
The findings of the three models showed outstanding results for multiclassification network intrusion detection.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Artificial neural networks</subject><subject>Availability</subject><subject>Big Data</subject><subject>Business metrics</subject><subject>Classification</subject><subject>Computer networks</subject><subject>Computers</subject><subject>Cyberterrorism</subject><subject>Datasets</subject><subject>Deep Learning</subject><subject>Detectors</subject><subject>Experiments</subject><subject>Intrusion detection systems</subject><subject>Literature reviews</subject><subject>Long short-term memory</subject><subject>Machine learning</subject><subject>Neural networks</subject><subject>Neural Networks, Computer</subject><subject>Prediction models</subject><issn>1687-5265</issn><issn>1687-5273</issn><issn>1687-5273</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>RHX</sourceid><sourceid>EIF</sourceid><sourceid>BENPR</sourceid><recordid>eNp9kUtvEzEUhUcIREthxxqNxAYJhvoxtscbpGha2kjhIYWuLce-k7hM7GDPUPHv8TQhPBasfCx_5_henaJ4jtFbjBk7J4iQc4ppvtAHxSnmjagYEfThUXN2UjxJ6RYhJhgij4sTyhGTVLLTop_ttNlAudzp-LXU3pYXALtyATp659flh2ChT2UXYnnt1pvqM8Sst9obKD_CcBeya-6HOCYXfPYOYIZJ3aTJ3S4vq3beVvOLJUG4eVo86nSf4NnhPCtu3l9-aa-rxaereTtbVKYWfKg4X4mGWdShpiGCA8Gms9biBlHZEYq5FJQZi6wgDdGEoFWDNRiqZV2j2gI9K97tc3fjagvWQB5Q92oX3VbHHypop_5-8W6j1uG7kjWVjZQ54NUhIIZvI6RBbV0y0PfaQxiTIgJJiQnlE_ryH_Q2jNHn9e4pJgVuxG9qrXtQznch_2umUDUTmONaIk4z9WZPmRhSitAdR8ZITWWrqWx1KDvjL_5c8wj_ajcDr_fAxnmr79z_434CTBKuFQ</recordid><startdate>20220826</startdate><enddate>20220826</enddate><creator>Hagar, Abdulnaser A.</creator><creator>Gawali, Bharti W.</creator><general>Hindawi</general><general>John Wiley &amp; Sons, Inc</general><general>Hindawi 
Limited</general><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QF</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TK</scope><scope>7U5</scope><scope>7X7</scope><scope>7XB</scope><scope>8AL</scope><scope>8BQ</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>COVID</scope><scope>CWDGH</scope><scope>DWQXO</scope><scope>F28</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>H8D</scope><scope>H8G</scope><scope>HCIFZ</scope><scope>JG9</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>KR7</scope><scope>L6V</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PSYQQ</scope><scope>PTHSS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-3351-0966</orcidid><orcidid>https://orcid.org/0000-0002-8353-5849</orcidid></search><sort><creationdate>20220826</creationdate><title>Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018</title><author>Hagar, Abdulnaser A. 
; Gawali, Bharti W.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c476t-66b785d0f088276e21cfddd18039f23169735cd0d7282a220b81aec3a94404de3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Artificial neural networks</topic><topic>Availability</topic><topic>Big Data</topic><topic>Business metrics</topic><topic>Classification</topic><topic>Computer networks</topic><topic>Computers</topic><topic>Cyberterrorism</topic><topic>Datasets</topic><topic>Deep Learning</topic><topic>Detectors</topic><topic>Experiments</topic><topic>Intrusion detection systems</topic><topic>Literature reviews</topic><topic>Long short-term memory</topic><topic>Machine learning</topic><topic>Neural networks</topic><topic>Neural Networks, Computer</topic><topic>Prediction models</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hagar, Abdulnaser A.</creatorcontrib><creatorcontrib>Gawali, Bharti W.</creatorcontrib><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Aluminium Industry Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical &amp; Transportation Engineering 
Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Computing Database (Alumni Edition)</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>Coronavirus Research Database</collection><collection>Middle East &amp; Africa Database</collection><collection>ProQuest Central Korea</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Aerospace Database</collection><collection>Copper Technical Reference Library</collection><collection>SciTech Premium Collection</collection><collection>Materials Research 
Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Civil Engineering Abstracts</collection><collection>ProQuest Engineering Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>ProQuest Biological Science Collection</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Engineering Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest One Psychology</collection><collection>Engineering Collection</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Computational intelligence and neuroscience</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hagar, Abdulnaser A.</au><au>Gawali, Bharti W.</au><au>Ijaz, Muhammad Fazal</au><au>Muhammad Fazal Ijaz</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection 
Using CSE-CIC-IDS2018</atitle><jtitle>Computational intelligence and neuroscience</jtitle><addtitle>Comput Intell Neurosci</addtitle><date>2022-08-26</date><risdate>2022</risdate><volume>2022</volume><spage>3131153</spage><epage>11</epage><pages>3131153-11</pages><issn>1687-5265</issn><issn>1687-5273</issn><eissn>1687-5273</eissn><abstract>Keeping computers secure is becoming challenging as networks grow and new network-based technologies emerge. Cybercriminals’ attack surface expands with the release of new internet-enabled products. As many cyberattacks affect businesses’ confidentiality, availability, and integrity, network intrusion detection systems (NIDS) show an essential role. Network-based intrusion detection uses datasets like CSE-CIC-IDS2018 to train prediction models. With fourteen types of attacks included, the latest big data set for intrusion detection is available to the public. This work proposes three models, two deep learning convolutional neural networks (CNN), long short-term memory (LSTM), and Apache Spark, to improve the detection of all types of attacks. To reduce the dimensionality, random forests (RF) was employed to select the important features; it gave 19 from 84 features. The dataset is imbalanced; thus, oversampling and undersampling techniques reduce the imbalance ratio. The Apache Spark model produced the best results across all 15 classes, with accuracy as high as 100% for all classes, as seen by the experiments’ findings. For the F1-score, Apache Spark showed the highest results with 1.00 for most classes. The findings of the three models showed outstanding results for multiclassification network intrusion detection.</abstract><cop>United States</cop><pub>Hindawi</pub><pmid>36059395</pmid><doi>10.1155/2022/3131153</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0003-3351-0966</orcidid><orcidid>https://orcid.org/0000-0002-8353-5849</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1687-5265
ispartof Computational intelligence and neuroscience, 2022-08, Vol.2022, p.3131153-11
issn 1687-5265
1687-5273
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9439899
source MEDLINE; Wiley Online Library Open Access; PMC (PubMed Central); EZB-FREE-00999 freely available EZB journals; Alma/SFX Local Collection; PubMed Central Open Access
subjects Accuracy
Algorithms
Artificial neural networks
Availability
Big Data
Business metrics
Classification
Computer networks
Computers
Cyberterrorism
Datasets
Deep Learning
Detectors
Experiments
Intrusion detection systems
Literature reviews
Long short-term memory
Machine learning
Neural networks
Neural Networks, Computer
Prediction models
title Apache Spark and Deep Learning Models for High-Performance Network Intrusion Detection Using CSE-CIC-IDS2018
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T00%3A40%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Apache%20Spark%20and%20Deep%20Learning%20Models%20for%20High-Performance%20Network%20Intrusion%20Detection%20Using%20CSE-CIC-IDS2018&rft.jtitle=Computational%20intelligence%20and%20neuroscience&rft.au=Hagar,%20Abdulnaser%20A.&rft.date=2022-08-26&rft.volume=2022&rft.spage=3131153&rft.epage=11&rft.pages=3131153-11&rft.issn=1687-5265&rft.eissn=1687-5273&rft_id=info:doi/10.1155/2022/3131153&rft_dat=%3Cgale_pubme%3EA716149063%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2709597187&rft_id=info:pmid/36059395&rft_galeid=A716149063&rfr_iscdi=true