Network anomaly detection based on selective ensemble algorithm

In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed...

Detailed description

Bibliographic details
Published in: The Journal of supercomputing 2021-03, Vol.77 (3), p.2875-2896
Main authors: Du, Hongle, Zhang, Yan
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 2896
container_issue 3
container_start_page 2875
container_title The Journal of supercomputing
container_volume 77
creator Du, Hongle
Zhang, Yan
description To reduce the loss of majority-class information during resampling, this paper proposes a two-level selective ensemble learning algorithm for imbalanced datasets that combines the class distribution of the samples with the characteristics of ensemble learning. First, the algorithm under-samples the majority class and constructs multiple training subsets. On each training subset, it generates multiple base classifiers with the AdaBoost algorithm, selects some of them according to maximum correlation and minimum redundancy criteria, and combines the selected base classifiers into a sub-classifier by weighted integration. This yields one sub-classifier per training subset; some of these sub-classifiers are then selected, again according to maximum correlation and minimum redundancy criteria. The weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Experimental results on UCI datasets show that the method can improve classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for the Internet of Things. The simulation results on the KDDCUP99 dataset show that the TLSE-ID algorithm has a low missing report rate and high precision.
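
The outer steps of the description (under-sampling the majority class into several training subsets, scoring sub-classifiers, and weighted voting) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a binary problem, assumes that "F-means or G-means" refers to the standard F-measure and G-mean metrics, and omits both the AdaBoost training and the maximum correlation / minimum redundancy selection. All function names (`undersample_subsets`, `g_mean`, `f_measure`, `weighted_vote`) are hypothetical.

```python
import math
import random
from collections import Counter


def undersample_subsets(samples, labels, n_subsets, seed=0):
    """Build balanced subsets: every minority sample plus an equally sized
    random draw (without replacement) from the majority class.
    Assumption: binary labels; the rarer label is the minority class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    subsets = []
    for _ in range(n_subsets):
        drawn = rng.sample(majority_idx, len(minority_idx))
        idx = minority_idx + drawn
        subsets.append(([samples[i] for i in idx], [labels[i] for i in idx]))
    return subsets


def _confusion(y_true, y_pred, positive):
    """Return (tp, fp, tn, fn) for the given positive label."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of recall and specificity (assumed meaning of G-means)."""
    tp, fp, tn, fn = _confusion(y_true, y_pred, positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(recall * specificity)


def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall (assumed meaning of F-means)."""
    tp, fp, tn, fn = _confusion(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_vote(predictions, weights):
    """Combine per-classifier label lists; each classifier votes with its weight."""
    combined = []
    for j in range(len(predictions[0])):
        score = {}
        for preds, w in zip(predictions, weights):
            score[preds[j]] = score.get(preds[j], 0.0) + w
        combined.append(max(score, key=score.get))
    return combined
```

In this sketch, a sub-classifier's weight would be its `g_mean` or `f_measure` on held-out data, and `weighted_vote` would then combine the predictions of the selected sub-classifiers. How the paper actually computes and normalizes these weights is not stated in the record.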
doi_str_mv 10.1007/s11227-020-03374-z
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2489026891</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2489026891</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</originalsourceid><addsrcrecordid>eNp9kE1LAzEQhoMoWKt_wNOC5-hkkt1kTyLFLyh60XPItpPauh812Srtrzd1BW-eZph535mXh7FzAZcCQF9FIRA1BwQOUmrFdwdsJHItOSijDtkIyrQyucJjdhLjCgCU1HLErp-o_-rCe-barnH1NptTT7N-2bVZ5SLNs9REqvejT8qojdRUNWWuXnRh2b81p-zIuzrS2W8ds9e725fJA58-3z9ObqZ8JkXZ87k3ziv0SpeUS4emIPCVV9KkTCkLalGJ3HmBZPJKIwpUBRReOKO9Mk6O2cVwdx26jw3F3q66TWjTS4vKlICFKUVS4aCahS7GQN6uw7JxYWsF2D0oO4CyCZT9AWV3ySQHU0zidkHh7_Q_rm-ygWsG</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2489026891</pqid></control><display><type>article</type><title>Network anomaly detection based on selective ensemble algorithm</title><source>SpringerLink Journals - AutoHoldings</source><creator>Du, Hongle ; Zhang, Yan</creator><creatorcontrib>Du, Hongle ; Zhang, Yan</creatorcontrib><description>In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed. Firstly, the algorithm under-samples the majority class samples and constructs multiple training subsets. The training process will generate multiple base classifiers using AdaBoost algorithm, then select some base classifiers according to maximum correlation and minimum redundancy criteria, and form sub-classifiers according to weighted integration. Then, generate multiple sub-classifiers for multiple training subsets, and then, select some sub-classifiers according to maximum correlation and minimum redundancy criteria. 
Then, the weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Finally, the improved algorithm for imbalanced dataset is applied to the network anomaly detection. The experimental results on UCI datasets show that this method can improve the classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for Internet of Things. From the simulation data of KDDCUP99 dataset, we can see that TLSE-ID algorithm has a small missing report rate and high precision.</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-020-03374-z</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Anomalies ; Classifiers ; Compilers ; Computer Science ; Datasets ; Deep Learning in IoT: Emerging Trends and Applications - 2019 ; Internet of Things ; Interpreters ; Machine learning ; Processor Architectures ; Programming Languages ; Redundancy ; Resampling ; Training</subject><ispartof>The Journal of supercomputing, 2021-03, Vol.77 (3), p.2875-2896</ispartof><rights>Springer Science+Business Media, LLC, part of Springer Nature 2020</rights><rights>Springer Science+Business Media, LLC, part of Springer Nature 
2020.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</citedby><cites>FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</cites><orcidid>0000-0002-6417-3600</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-020-03374-z$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-020-03374-z$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Du, Hongle</creatorcontrib><creatorcontrib>Zhang, Yan</creatorcontrib><title>Network anomaly detection based on selective ensemble algorithm</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed. Firstly, the algorithm under-samples the majority class samples and constructs multiple training subsets. The training process will generate multiple base classifiers using AdaBoost algorithm, then select some base classifiers according to maximum correlation and minimum redundancy criteria, and form sub-classifiers according to weighted integration. Then, generate multiple sub-classifiers for multiple training subsets, and then, select some sub-classifiers according to maximum correlation and minimum redundancy criteria. 
Then, the weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Finally, the improved algorithm for imbalanced dataset is applied to the network anomaly detection. The experimental results on UCI datasets show that this method can improve the classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for Internet of Things. From the simulation data of KDDCUP99 dataset, we can see that TLSE-ID algorithm has a small missing report rate and high precision.</description><subject>Algorithms</subject><subject>Anomalies</subject><subject>Classifiers</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Datasets</subject><subject>Deep Learning in IoT: Emerging Trends and Applications - 2019</subject><subject>Internet of Things</subject><subject>Interpreters</subject><subject>Machine learning</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><subject>Redundancy</subject><subject>Resampling</subject><subject>Training</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LAzEQhoMoWKt_wNOC5-hkkt1kTyLFLyh60XPItpPauh812Srtrzd1BW-eZph535mXh7FzAZcCQF9FIRA1BwQOUmrFdwdsJHItOSijDtkIyrQyucJjdhLjCgCU1HLErp-o_-rCe-barnH1NptTT7N-2bVZ5SLNs9REqvejT8qojdRUNWWuXnRh2b81p-zIuzrS2W8ds9e725fJA58-3z9ObqZ8JkXZ87k3ziv0SpeUS4emIPCVV9KkTCkLalGJ3HmBZPJKIwpUBRReOKO9Mk6O2cVwdx26jw3F3q66TWjTS4vKlICFKUVS4aCahS7GQN6uw7JxYWsF2D0oO4CyCZT9AWV3ySQHU0zidkHh7_Q_rm-ygWsG</recordid><startdate>20210301</startdate><enddate>20210301</enddate><creator>Du, Hongle</creator><creator>Zhang, Yan</creator><general>Springer US</general><general>Springer Nature 
B.V</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-6417-3600</orcidid></search><sort><creationdate>20210301</creationdate><title>Network anomaly detection based on selective ensemble algorithm</title><author>Du, Hongle ; Zhang, Yan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Anomalies</topic><topic>Classifiers</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Datasets</topic><topic>Deep Learning in IoT: Emerging Trends and Applications - 2019</topic><topic>Internet of Things</topic><topic>Interpreters</topic><topic>Machine learning</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><topic>Redundancy</topic><topic>Resampling</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Du, Hongle</creatorcontrib><creatorcontrib>Zhang, Yan</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Du, Hongle</au><au>Zhang, Yan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Network anomaly detection based on selective ensemble algorithm</atitle><jtitle>The Journal of supercomputing</jtitle><stitle>J Supercomput</stitle><date>2021-03-01</date><risdate>2021</risdate><volume>77</volume><issue>3</issue><spage>2875</spage><epage>2896</epage><pages>2875-2896</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the 
characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed. Firstly, the algorithm under-samples the majority class samples and constructs multiple training subsets. The training process will generate multiple base classifiers using AdaBoost algorithm, then select some base classifiers according to maximum correlation and minimum redundancy criteria, and form sub-classifiers according to weighted integration. Then, generate multiple sub-classifiers for multiple training subsets, and then, select some sub-classifiers according to maximum correlation and minimum redundancy criteria. Then, the weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Finally, the improved algorithm for imbalanced dataset is applied to the network anomaly detection. The experimental results on UCI datasets show that this method can improve the classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for Internet of Things. From the simulation data of KDDCUP99 dataset, we can see that TLSE-ID algorithm has a small missing report rate and high precision.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-020-03374-z</doi><tpages>22</tpages><orcidid>https://orcid.org/0000-0002-6417-3600</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0920-8542
ispartof The Journal of supercomputing, 2021-03, Vol.77 (3), p.2875-2896
issn 0920-8542
1573-0484
language eng
recordid cdi_proquest_journals_2489026891
source SpringerLink Journals - AutoHoldings
subjects Algorithms
Anomalies
Classifiers
Compilers
Computer Science
Datasets
Deep Learning in IoT: Emerging Trends and Applications - 2019
Internet of Things
Interpreters
Machine learning
Processor Architectures
Programming Languages
Redundancy
Resampling
Training
title Network anomaly detection based on selective ensemble algorithm
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T20%3A45%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Network%20anomaly%20detection%20based%20on%20selective%20ensemble%20algorithm&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Du,%20Hongle&rft.date=2021-03-01&rft.volume=77&rft.issue=3&rft.spage=2875&rft.epage=2896&rft.pages=2875-2896&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-020-03374-z&rft_dat=%3Cproquest_cross%3E2489026891%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2489026891&rft_id=info:pmid/&rfr_iscdi=true