Enhancing network intrusion detection: a dual-ensemble approach with CTGAN-balanced data and weak classifiers

With the expansion of the Internet, Internet of Things devices, and related services, effective intrusion detection systems are vital in cybersecurity. This study presents a significant advancement in cybersecurity by leveraging ensemble learning techniques alongside generative adversarial networks,...

Bibliographic details
Published in: The Journal of supercomputing 2024, Vol.80 (11), p.16301-16333
Main authors: Soflaei, Mohammad Reza Abbaszadeh Bavil; Salehpour, Arash; Samadzamini, Karim
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 16333
container_issue 11
container_start_page 16301
container_title The Journal of supercomputing
container_volume 80
creator Soflaei, Mohammad Reza Abbaszadeh Bavil
Salehpour, Arash
Samadzamini, Karim
description With the expansion of the Internet, Internet of Things devices, and related services, effective intrusion detection systems are vital in cybersecurity. This study presents a significant advancement in cybersecurity by leveraging ensemble learning techniques alongside generative adversarial networks, proposing a novel framework for network behavior classification using the UNSW-NB15 dataset. Similar to any other real-world dataset, the UNSW-NB15 dataset poses inherent challenges of data imbalance, with significantly fewer instances of intrusion compared to normal network behavior. Our main contribution to the existing literature is the introduction of a conditional tabular generative adversarial network (CTGAN), aimed at addressing the existing issue of data imbalance in the dataset. In previous approaches, this issue was often overlooked; however, the proposed framework achieves a substantial improvement in model performance by balancing this dataset. Through training three shallow binary classification algorithms (decision trees, logistic regression, and Gaussian naive Bayes) on both the CTGAN-balanced data and the original imbalanced dataset, we uncover remarkable improvements in identifying network intrusion. Our study employs a novel two-stage label-wise ensembling process, notably resulting in a final XGBoost meta-classifier. The ultimate achievement of our framework demonstrates 98% accuracy for binary classification and 95% for multi-class classification, outperforming existing state-of-the-art models. By offering a robust framework for effective intrusion detection, this work marks a substantial step forward in addressing data imbalance challenges within the UNSW-NB15 dataset.
doi_str_mv 10.1007/s11227-024-06108-7
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3072276471</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3072276471</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-f309c6809b57605bf6e134208f36fda282e70ffd2ee4f6e58a7c5027da7a47e03</originalsourceid><addsrcrecordid>eNp9kM1LAzEQxYMoWKv_gKeA5-jkYzdbb6XUDyh60XOY7k7s2m22JluK_73RCt48zcC89-bxY-xSwrUEsDdJSqWsAGUElBIqYY_YSBZWCzCVOWYjmCgQVWHUKTtL6R0AjLZ6xDbzsMJQt-GNBxr2fVzzNgxxl9o-8IYGqoe83XLkzQ47QSHRZtkRx-029liv-L4dVnz2cj99EkvschQ1vMEBOYaG7wnXvO4wpda3FNM5O_HYJbr4nWP2ejd_mT2IxfP942y6ELWyMAivYVKXFUyWhS2hWPqSpDYKKq9L36CqFFnwvlFEJt-KCm1dgLINWjSWQI_Z1SE3l_zYURrce7-LIb90GmwmVRors0odVHXsU4rk3Ta2G4yfToL7xuoOWF3G6n6wOptN-mBKWRzeKP5F_-P6ApiWe1Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3072276471</pqid></control><display><type>article</type><title>Enhancing network intrusion detection: a dual-ensemble approach with CTGAN-balanced data and weak classifiers</title><source>SpringerLink Journals - AutoHoldings</source><creator>Soflaei, Mohammad Reza Abbaszadeh Bavil ; Salehpour, Arash ; Samadzamini, Karim</creator><creatorcontrib>Soflaei, Mohammad Reza Abbaszadeh Bavil ; Salehpour, Arash ; Samadzamini, Karim</creatorcontrib><description>With the expansion of the Internet, Internet of Things devices, and related services, effective intrusion detection systems are vital in cybersecurity. This study presents a significant advancement in cybersecurity by leveraging ensemble learning techniques alongside generative adversarial networks, proposing a novel framework for network behavior classification using the UNSW-NB15 dataset. Similar to any other real-world dataset, the UNSW-NB15 dataset poses inherent challenges of data imbalance, with significantly fewer instances of intrusion compared to normal network behavior. 
Our main contribution to the existing literature is the introduction of a conditional tabular generative adversarial network (CTGAN), aimed at addressing the existing issue of data imbalance in the dataset. In previous approaches, this issue was often overlooked; however, the proposed framework achieves a substantial improvement in model performance by balancing this dataset. Through training three shallow binary classification algorithms (decision trees, logistic regression, and Gaussian naive Bayes) on both the CTGAN-balanced data and the original imbalanced dataset, we uncover remarkable improvements in identifying network intrusion. Our study employs a novel two-stage label-wise ensembling process, notably resulting in a final XGBoost meta-classifier. The ultimate achievement of our framework demonstrates 98% accuracy for binary classification and 95% for multi-class classification, outperforming existing state-of-the-art models. By offering a robust framework for effective intrusion detection, this work marks a substantial step forward in addressing data imbalance challenges within the UNSW-NB15 dataset.</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-024-06108-7</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Classification ; Classifiers ; Compilers ; Computer Science ; Cybersecurity ; Datasets ; Decision trees ; Ensemble learning ; Generative adversarial networks ; Internet of Things ; Interpreters ; Intrusion detection systems ; Machine learning ; Processor Architectures ; Programming Languages ; System effectiveness</subject><ispartof>The Journal of supercomputing, 2024, Vol.80 (11), p.16301-16333</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-f309c6809b57605bf6e134208f36fda282e70ffd2ee4f6e58a7c5027da7a47e03</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-024-06108-7$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-024-06108-7$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Soflaei, Mohammad Reza Abbaszadeh Bavil</creatorcontrib><creatorcontrib>Salehpour, Arash</creatorcontrib><creatorcontrib>Samadzamini, Karim</creatorcontrib><title>Enhancing network intrusion detection: a dual-ensemble approach with CTGAN-balanced data and weak classifiers</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>With the expansion of the Internet, Internet of Things devices, and related services, effective intrusion detection systems are vital in cybersecurity. This study presents a significant advancement in cybersecurity by leveraging ensemble learning techniques alongside generative adversarial networks, proposing a novel framework for network behavior classification using the UNSW-NB15 dataset. Similar to any other real-world dataset, the UNSW-NB15 dataset poses inherent challenges of data imbalance, with significantly fewer instances of intrusion compared to normal network behavior. 
Our main contribution to the existing literature is the introduction of a conditional tabular generative adversarial network (CTGAN), aimed at addressing the existing issue of data imbalance in the dataset. In previous approaches, this issue was often overlooked; however, the proposed framework achieves a substantial improvement in model performance by balancing this dataset. Through training three shallow binary classification algorithms (decision trees, logistic regression, and Gaussian naive Bayes) on both the CTGAN-balanced data and the original imbalanced dataset, we uncover remarkable improvements in identifying network intrusion. Our study employs a novel two-stage label-wise ensembling process, notably resulting in a final XGBoost meta-classifier. The ultimate achievement of our framework demonstrates 98% accuracy for binary classification and 95% for multi-class classification, outperforming existing state-of-the-art models. By offering a robust framework for effective intrusion detection, this work marks a substantial step forward in addressing data imbalance challenges within the UNSW-NB15 dataset.</description><subject>Algorithms</subject><subject>Classification</subject><subject>Classifiers</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Cybersecurity</subject><subject>Datasets</subject><subject>Decision trees</subject><subject>Ensemble learning</subject><subject>Generative adversarial networks</subject><subject>Internet of Things</subject><subject>Interpreters</subject><subject>Intrusion detection systems</subject><subject>Machine learning</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><subject>System 
effectiveness</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kM1LAzEQxYMoWKv_gKeA5-jkYzdbb6XUDyh60XOY7k7s2m22JluK_73RCt48zcC89-bxY-xSwrUEsDdJSqWsAGUElBIqYY_YSBZWCzCVOWYjmCgQVWHUKTtL6R0AjLZ6xDbzsMJQt-GNBxr2fVzzNgxxl9o-8IYGqoe83XLkzQ47QSHRZtkRx-029liv-L4dVnz2cj99EkvschQ1vMEBOYaG7wnXvO4wpda3FNM5O_HYJbr4nWP2ejd_mT2IxfP942y6ELWyMAivYVKXFUyWhS2hWPqSpDYKKq9L36CqFFnwvlFEJt-KCm1dgLINWjSWQI_Z1SE3l_zYURrce7-LIb90GmwmVRors0odVHXsU4rk3Ta2G4yfToL7xuoOWF3G6n6wOptN-mBKWRzeKP5F_-P6ApiWe1Q</recordid><startdate>2024</startdate><enddate>2024</enddate><creator>Soflaei, Mohammad Reza Abbaszadeh Bavil</creator><creator>Salehpour, Arash</creator><creator>Samadzamini, Karim</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>2024</creationdate><title>Enhancing network intrusion detection: a dual-ensemble approach with CTGAN-balanced data and weak classifiers</title><author>Soflaei, Mohammad Reza Abbaszadeh Bavil ; Salehpour, Arash ; Samadzamini, Karim</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-f309c6809b57605bf6e134208f36fda282e70ffd2ee4f6e58a7c5027da7a47e03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Classification</topic><topic>Classifiers</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Cybersecurity</topic><topic>Datasets</topic><topic>Decision trees</topic><topic>Ensemble learning</topic><topic>Generative adversarial networks</topic><topic>Internet of Things</topic><topic>Interpreters</topic><topic>Intrusion detection systems</topic><topic>Machine learning</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><topic>System 
effectiveness</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Soflaei, Mohammad Reza Abbaszadeh Bavil</creatorcontrib><creatorcontrib>Salehpour, Arash</creatorcontrib><creatorcontrib>Samadzamini, Karim</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Soflaei, Mohammad Reza Abbaszadeh Bavil</au><au>Salehpour, Arash</au><au>Samadzamini, Karim</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Enhancing network intrusion detection: a dual-ensemble approach with CTGAN-balanced data and weak classifiers</atitle><jtitle>The Journal of supercomputing</jtitle><stitle>J Supercomput</stitle><date>2024</date><risdate>2024</risdate><volume>80</volume><issue>11</issue><spage>16301</spage><epage>16333</epage><pages>16301-16333</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>With the expansion of the Internet, Internet of Things devices, and related services, effective intrusion detection systems are vital in cybersecurity. This study presents a significant advancement in cybersecurity by leveraging ensemble learning techniques alongside generative adversarial networks, proposing a novel framework for network behavior classification using the UNSW-NB15 dataset. Similar to any other real-world dataset, the UNSW-NB15 dataset poses inherent challenges of data imbalance, with significantly fewer instances of intrusion compared to normal network behavior. Our main contribution to the existing literature is the introduction of a conditional tabular generative adversarial network (CTGAN), aimed at addressing the existing issue of data imbalance in the dataset. In previous approaches, this issue was often overlooked; however, the proposed framework achieves a substantial improvement in model performance by balancing this dataset. 
Through training three shallow binary classification algorithms (decision trees, logistic regression, and Gaussian naive Bayes) on both the CTGAN-balanced data and the original imbalanced dataset, we uncover remarkable improvements in identifying network intrusion. Our study employs a novel two-stage label-wise ensembling process, notably resulting in a final XGBoost meta-classifier. The ultimate achievement of our framework demonstrates 98% accuracy for binary classification and 95% for multi-class classification, outperforming existing state-of-the-art models. By offering a robust framework for effective intrusion detection, this work marks a substantial step forward in addressing data imbalance challenges within the UNSW-NB15 dataset.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-024-06108-7</doi><tpages>33</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0920-8542
ispartof The Journal of supercomputing, 2024, Vol.80 (11), p.16301-16333
issn 0920-8542
1573-0484
language eng
recordid cdi_proquest_journals_3072276471
source SpringerLink Journals - AutoHoldings
subjects Algorithms
Classification
Classifiers
Compilers
Computer Science
Cybersecurity
Datasets
Decision trees
Ensemble learning
Generative adversarial networks
Internet of Things
Interpreters
Intrusion detection systems
Machine learning
Processor Architectures
Programming Languages
System effectiveness
title Enhancing network intrusion detection: a dual-ensemble approach with CTGAN-balanced data and weak classifiers
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T11%3A00%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Enhancing%20network%20intrusion%20detection:%20a%20dual-ensemble%20approach%20with%20CTGAN-balanced%20data%20and%20weak%20classifiers&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Soflaei,%20Mohammad%20Reza%20Abbaszadeh%20Bavil&rft.date=2024&rft.volume=80&rft.issue=11&rft.spage=16301&rft.epage=16333&rft.pages=16301-16333&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-024-06108-7&rft_dat=%3Cproquest_cross%3E3072276471%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3072276471&rft_id=info:pmid/&rfr_iscdi=true
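
The description field above says the framework balances UNSW-NB15 with a CTGAN before training the weak classifiers. As a rough illustration of what "balancing" implies numerically, the sketch below computes how many synthetic rows a generator would have to produce per class so that every class matches the majority class. This is an assumption for illustration only: the record does not state the authors' balancing target, the helper name `synthetic_rows_needed` is hypothetical, and the toy labels are invented rather than taken from UNSW-NB15.

```python
from collections import Counter


def synthetic_rows_needed(labels):
    """Per class, count the synthetic rows needed to reach the majority-class size.

    Assumption (not stated in the record): every minority class is
    topped up to the size of the largest class.
    """
    counts = Counter(labels)
    target = max(counts.values(), default=0)  # size of the largest class
    return {cls: target - n for cls, n in counts.items()}


# Toy labels, invented for illustration (not UNSW-NB15 statistics).
labels = ["normal"] * 8 + ["dos"] * 3 + ["exploit"] * 1
needed = synthetic_rows_needed(labels)
print(needed)  # {'normal': 0, 'dos': 5, 'exploit': 7}
```

In a real pipeline these per-class counts would be the number of rows requested from a trained tabular generator such as a CTGAN, conditioned on the class label; that step is omitted here because the record gives no implementation details.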