Network anomaly detection based on selective ensemble algorithm

In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed...

Detailed description

Bibliographic details
Published in:	The Journal of supercomputing 2021-03, Vol.77 (3), p.2875-2896
Main authors:	Du, Hongle; Zhang, Yan
Format:	Article
Language:	eng
Subjects:	Algorithms; Anomalies; Classifiers; Compilers; Computer Science; Datasets; Deep Learning in IoT: Emerging Trends and Applications - 2019; Internet of Things; Interpreters; Machine learning; Processor Architectures; Programming Languages; Redundancy; Resampling; Training
Online access:	Full text

container_end_page	2896
container_issue	3
container_start_page	2875
container_title	The Journal of supercomputing
container_volume	77
creator	Du, Hongle ; Zhang, Yan
description	In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed. Firstly, the algorithm under-samples the majority class samples and constructs multiple training subsets. The training process will generate multiple base classifiers using AdaBoost algorithm, then select some base classifiers according to maximum correlation and minimum redundancy criteria, and form sub-classifiers according to weighted integration. Then, generate multiple sub-classifiers for multiple training subsets, and then, select some sub-classifiers according to maximum correlation and minimum redundancy criteria. Then, the weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Finally, the improved algorithm for imbalanced dataset is applied to the network anomaly detection. The experimental results on UCI datasets show that this method can improve the classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for Internet of Things. From the simulation data of KDDCUP99 dataset, we can see that TLSE-ID algorithm has a small missing report rate and high precision.
doi_str_mv	10.1007/s11227-020-03374-z
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2489026891</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2489026891</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</originalsourceid><addsrcrecordid>eNp9kE1LAzEQhoMoWKt_wNOC5-hkkt1kTyLFLyh60XPItpPauh812Srtrzd1BW-eZph535mXh7FzAZcCQF9FIRA1BwQOUmrFdwdsJHItOSijDtkIyrQyucJjdhLjCgCU1HLErp-o_-rCe-barnH1NptTT7N-2bVZ5SLNs9REqvejT8qojdRUNWWuXnRh2b81p-zIuzrS2W8ds9e725fJA58-3z9ObqZ8JkXZ87k3ziv0SpeUS4emIPCVV9KkTCkLalGJ3HmBZPJKIwpUBRReOKO9Mk6O2cVwdx26jw3F3q66TWjTS4vKlICFKUVS4aCahS7GQN6uw7JxYWsF2D0oO4CyCZT9AWV3ySQHU0zidkHh7_Q_rm-ygWsG</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2489026891</pqid></control><display><type>article</type><title>Network anomaly detection based on selective ensemble algorithm</title><source>SpringerLink Journals - AutoHoldings</source><creator>Du, Hongle ; Zhang, Yan</creator><creatorcontrib>Du, Hongle ; Zhang, Yan</creatorcontrib><description>In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed. Firstly, the algorithm under-samples the majority class samples and constructs multiple training subsets. The training process will generate multiple base classifiers using AdaBoost algorithm, then select some base classifiers according to maximum correlation and minimum redundancy criteria, and form sub-classifiers according to weighted integration. Then, generate multiple sub-classifiers for multiple training subsets, and then, select some sub-classifiers according to maximum correlation and minimum redundancy criteria. 
Then, the weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Finally, the improved algorithm for imbalanced dataset is applied to the network anomaly detection. The experimental results on UCI datasets show that this method can improve the classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for Internet of Things. From the simulation data of KDDCUP99 dataset, we can see that TLSE-ID algorithm has a small missing report rate and high precision.</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-020-03374-z</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Anomalies ; Classifiers ; Compilers ; Computer Science ; Datasets ; Deep Learning in IoT: Emerging Trends and Applications - 2019 ; Internet of Things ; Interpreters ; Machine learning ; Processor Architectures ; Programming Languages ; Redundancy ; Resampling ; Training</subject><ispartof>The Journal of supercomputing, 2021-03, Vol.77 (3), p.2875-2896</ispartof><rights>Springer Science+Business Media, LLC, part of Springer Nature 2020</rights><rights>Springer Science+Business Media, LLC, part of Springer Nature 
2020.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</citedby><cites>FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</cites><orcidid>0000-0002-6417-3600</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-020-03374-z$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-020-03374-z$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Du, Hongle</creatorcontrib><creatorcontrib>Zhang, Yan</creatorcontrib><title>Network anomaly detection based on selective ensemble algorithm</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed. Firstly, the algorithm under-samples the majority class samples and constructs multiple training subsets. The training process will generate multiple base classifiers using AdaBoost algorithm, then select some base classifiers according to maximum correlation and minimum redundancy criteria, and form sub-classifiers according to weighted integration. Then, generate multiple sub-classifiers for multiple training subsets, and then, select some sub-classifiers according to maximum correlation and minimum redundancy criteria. 
Then, the weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Finally, the improved algorithm for imbalanced dataset is applied to the network anomaly detection. The experimental results on UCI datasets show that this method can improve the classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for Internet of Things. From the simulation data of KDDCUP99 dataset, we can see that TLSE-ID algorithm has a small missing report rate and high precision.</description><subject>Algorithms</subject><subject>Anomalies</subject><subject>Classifiers</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Datasets</subject><subject>Deep Learning in IoT: Emerging Trends and Applications - 2019</subject><subject>Internet of Things</subject><subject>Interpreters</subject><subject>Machine learning</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><subject>Redundancy</subject><subject>Resampling</subject><subject>Training</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LAzEQhoMoWKt_wNOC5-hkkt1kTyLFLyh60XPItpPauh812Srtrzd1BW-eZph535mXh7FzAZcCQF9FIRA1BwQOUmrFdwdsJHItOSijDtkIyrQyucJjdhLjCgCU1HLErp-o_-rCe-barnH1NptTT7N-2bVZ5SLNs9REqvejT8qojdRUNWWuXnRh2b81p-zIuzrS2W8ds9e725fJA58-3z9ObqZ8JkXZ87k3ziv0SpeUS4emIPCVV9KkTCkLalGJ3HmBZPJKIwpUBRReOKO9Mk6O2cVwdx26jw3F3q66TWjTS4vKlICFKUVS4aCahS7GQN6uw7JxYWsF2D0oO4CyCZT9AWV3ySQHU0zidkHh7_Q_rm-ygWsG</recordid><startdate>20210301</startdate><enddate>20210301</enddate><creator>Du, Hongle</creator><creator>Zhang, Yan</creator><general>Springer US</general><general>Springer Nature 
B.V</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-6417-3600</orcidid></search><sort><creationdate>20210301</creationdate><title>Network anomaly detection based on selective ensemble algorithm</title><author>Du, Hongle ; Zhang, Yan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Anomalies</topic><topic>Classifiers</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Datasets</topic><topic>Deep Learning in IoT: Emerging Trends and Applications - 2019</topic><topic>Internet of Things</topic><topic>Interpreters</topic><topic>Machine learning</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><topic>Redundancy</topic><topic>Resampling</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Du, Hongle</creatorcontrib><creatorcontrib>Zhang, Yan</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Du, Hongle</au><au>Zhang, Yan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Network anomaly detection based on selective ensemble algorithm</atitle><jtitle>The Journal of supercomputing</jtitle><stitle>J Supercomput</stitle><date>2021-03-01</date><risdate>2021</risdate><volume>77</volume><issue>3</issue><spage>2875</spage><epage>2896</epage><pages>2875-2896</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the 
characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed. Firstly, the algorithm under-samples the majority class samples and constructs multiple training subsets. The training process will generate multiple base classifiers using AdaBoost algorithm, then select some base classifiers according to maximum correlation and minimum redundancy criteria, and form sub-classifiers according to weighted integration. Then, generate multiple sub-classifiers for multiple training subsets, and then, select some sub-classifiers according to maximum correlation and minimum redundancy criteria. Then, the weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Finally, the improved algorithm for imbalanced dataset is applied to the network anomaly detection. The experimental results on UCI datasets show that this method can improve the classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for Internet of Things. From the simulation data of KDDCUP99 dataset, we can see that TLSE-ID algorithm has a small missing report rate and high precision.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-020-03374-z</doi><tpages>22</tpages><orcidid>https://orcid.org/0000-0002-6417-3600</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 0920-8542
ispartof	The Journal of supercomputing, 2021-03, Vol.77 (3), p.2875-2896
issn	0920-8542 1573-0484
language	eng
recordid	cdi_proquest_journals_2489026891
source	SpringerLink Journals - AutoHoldings
subjects	Algorithms ; Anomalies ; Classifiers ; Compilers ; Computer Science ; Datasets ; Deep Learning in IoT: Emerging Trends and Applications - 2019 ; Internet of Things ; Interpreters ; Machine learning ; Processor Architectures ; Programming Languages ; Redundancy ; Resampling ; Training
title	Network anomaly detection based on selective ensemble algorithm
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T20%3A45%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Network%20anomaly%20detection%20based%20on%20selective%20ensemble%20algorithm&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Du,%20Hongle&rft.date=2021-03-01&rft.volume=77&rft.issue=3&rft.spage=2875&rft.epage=2896&rft.pages=2875-2896&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-020-03374-z&rft_dat=%3Cproquest_cross%3E2489026891%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2489026891&rft_id=info:pmid/&rfr_iscdi=true