Network anomaly detection based on selective ensemble algorithm
To reduce the loss of majority-class information during resampling, this paper proposes a two-level selective ensemble learning algorithm for imbalanced datasets that combines the distribution of class samples with the characteristics of ensemble learning...
Published in: | The Journal of supercomputing 2021-03, Vol.77 (3), p.2875-2896 |
---|---|
Main authors: | Du, Hongle ; Zhang, Yan |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 2896 |
---|---|
container_issue | 3 |
container_start_page | 2875 |
container_title | The Journal of supercomputing |
container_volume | 77 |
creator | Du, Hongle ; Zhang, Yan |
description | To reduce the loss of majority-class information during resampling, this paper proposes a two-level selective ensemble learning algorithm for imbalanced datasets that combines the distribution of class samples with the characteristics of ensemble learning. First, the algorithm under-samples the majority class and constructs multiple training subsets. On each subset, AdaBoost generates multiple base classifiers; some of these are selected according to maximum correlation and minimum redundancy criteria and combined by weighted integration into a sub-classifier. This yields one sub-classifier per training subset, and some sub-classifiers are in turn selected according to the same maximum correlation and minimum redundancy criteria. The weights of the selected sub-classifiers are then calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Experimental results on UCI datasets show that the method can improve classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for the Internet of Things: simulation on the KDDCUP99 dataset shows that the TLSE-ID algorithm has a low missing report rate and high precision. |
doi_str_mv | 10.1007/s11227-020-03374-z |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2489026891</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2489026891</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</originalsourceid><addsrcrecordid>eNp9kE1LAzEQhoMoWKt_wNOC5-hkkt1kTyLFLyh60XPItpPauh812Srtrzd1BW-eZph535mXh7FzAZcCQF9FIRA1BwQOUmrFdwdsJHItOSijDtkIyrQyucJjdhLjCgCU1HLErp-o_-rCe-barnH1NptTT7N-2bVZ5SLNs9REqvejT8qojdRUNWWuXnRh2b81p-zIuzrS2W8ds9e725fJA58-3z9ObqZ8JkXZ87k3ziv0SpeUS4emIPCVV9KkTCkLalGJ3HmBZPJKIwpUBRReOKO9Mk6O2cVwdx26jw3F3q66TWjTS4vKlICFKUVS4aCahS7GQN6uw7JxYWsF2D0oO4CyCZT9AWV3ySQHU0zidkHh7_Q_rm-ygWsG</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2489026891</pqid></control><display><type>article</type><title>Network anomaly detection based on selective ensemble algorithm</title><source>SpringerLink Journals - AutoHoldings</source><creator>Du, Hongle ; Zhang, Yan</creator><creatorcontrib>Du, Hongle ; Zhang, Yan</creatorcontrib><description>In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed. Firstly, the algorithm under-samples the majority class samples and constructs multiple training subsets. The training process will generate multiple base classifiers using AdaBoost algorithm, then select some base classifiers according to maximum correlation and minimum redundancy criteria, and form sub-classifiers according to weighted integration. Then, generate multiple sub-classifiers for multiple training subsets, and then, select some sub-classifiers according to maximum correlation and minimum redundancy criteria. 
Then, the weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Finally, the improved algorithm for imbalanced dataset is applied to the network anomaly detection. The experimental results on UCI datasets show that this method can improve the classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for Internet of Things. From the simulation data of KDDCUP99 dataset, we can see that TLSE-ID algorithm has a small missing report rate and high precision.</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-020-03374-z</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Anomalies ; Classifiers ; Compilers ; Computer Science ; Datasets ; Deep Learning in IoT: Emerging Trends and Applications - 2019 ; Internet of Things ; Interpreters ; Machine learning ; Processor Architectures ; Programming Languages ; Redundancy ; Resampling ; Training</subject><ispartof>The Journal of supercomputing, 2021-03, Vol.77 (3), p.2875-2896</ispartof><rights>Springer Science+Business Media, LLC, part of Springer Nature 2020</rights><rights>Springer Science+Business Media, LLC, part of Springer Nature 
2020.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</citedby><cites>FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</cites><orcidid>0000-0002-6417-3600</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-020-03374-z$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-020-03374-z$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Du, Hongle</creatorcontrib><creatorcontrib>Zhang, Yan</creatorcontrib><title>Network anomaly detection based on selective ensemble algorithm</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed. Firstly, the algorithm under-samples the majority class samples and constructs multiple training subsets. The training process will generate multiple base classifiers using AdaBoost algorithm, then select some base classifiers according to maximum correlation and minimum redundancy criteria, and form sub-classifiers according to weighted integration. Then, generate multiple sub-classifiers for multiple training subsets, and then, select some sub-classifiers according to maximum correlation and minimum redundancy criteria. 
Then, the weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Finally, the improved algorithm for imbalanced dataset is applied to the network anomaly detection. The experimental results on UCI datasets show that this method can improve the classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for Internet of Things. From the simulation data of KDDCUP99 dataset, we can see that TLSE-ID algorithm has a small missing report rate and high precision.</description><subject>Algorithms</subject><subject>Anomalies</subject><subject>Classifiers</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Datasets</subject><subject>Deep Learning in IoT: Emerging Trends and Applications - 2019</subject><subject>Internet of Things</subject><subject>Interpreters</subject><subject>Machine learning</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><subject>Redundancy</subject><subject>Resampling</subject><subject>Training</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LAzEQhoMoWKt_wNOC5-hkkt1kTyLFLyh60XPItpPauh812Srtrzd1BW-eZph535mXh7FzAZcCQF9FIRA1BwQOUmrFdwdsJHItOSijDtkIyrQyucJjdhLjCgCU1HLErp-o_-rCe-barnH1NptTT7N-2bVZ5SLNs9REqvejT8qojdRUNWWuXnRh2b81p-zIuzrS2W8ds9e725fJA58-3z9ObqZ8JkXZ87k3ziv0SpeUS4emIPCVV9KkTCkLalGJ3HmBZPJKIwpUBRReOKO9Mk6O2cVwdx26jw3F3q66TWjTS4vKlICFKUVS4aCahS7GQN6uw7JxYWsF2D0oO4CyCZT9AWV3ySQHU0zidkHh7_Q_rm-ygWsG</recordid><startdate>20210301</startdate><enddate>20210301</enddate><creator>Du, Hongle</creator><creator>Zhang, Yan</creator><general>Springer US</general><general>Springer Nature 
B.V</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-6417-3600</orcidid></search><sort><creationdate>20210301</creationdate><title>Network anomaly detection based on selective ensemble algorithm</title><author>Du, Hongle ; Zhang, Yan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-df8af42f479e53a286e0fbf438854004271b15af12e85b722124606f1a87f48a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Anomalies</topic><topic>Classifiers</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Datasets</topic><topic>Deep Learning in IoT: Emerging Trends and Applications - 2019</topic><topic>Internet of Things</topic><topic>Interpreters</topic><topic>Machine learning</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><topic>Redundancy</topic><topic>Resampling</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Du, Hongle</creatorcontrib><creatorcontrib>Zhang, Yan</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Du, Hongle</au><au>Zhang, Yan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Network anomaly detection based on selective ensemble algorithm</atitle><jtitle>The Journal of supercomputing</jtitle><stitle>J Supercomput</stitle><date>2021-03-01</date><risdate>2021</risdate><volume>77</volume><issue>3</issue><spage>2875</spage><epage>2896</epage><pages>2875-2896</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>In order to reduce the loss of information of the majority class samples in the resampling process, combining the distribution of class samples and the 
characteristics of ensemble learning algorithm, in this paper, a two-level selective ensemble learning algorithm for imbalanced datasets is proposed. Firstly, the algorithm under-samples the majority class samples and constructs multiple training subsets. The training process will generate multiple base classifiers using AdaBoost algorithm, then select some base classifiers according to maximum correlation and minimum redundancy criteria, and form sub-classifiers according to weighted integration. Then, generate multiple sub-classifiers for multiple training subsets, and then, select some sub-classifiers according to maximum correlation and minimum redundancy criteria. Then, the weights of the selected sub-classifiers are calculated by F-means or G-means, and the ensemble classifier is obtained by weighted voting. Finally, the improved algorithm for imbalanced dataset is applied to the network anomaly detection. The experimental results on UCI datasets show that this method can improve the classification performance to a certain extent, especially for imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for Internet of Things. From the simulation data of KDDCUP99 dataset, we can see that TLSE-ID algorithm has a small missing report rate and high precision.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-020-03374-z</doi><tpages>22</tpages><orcidid>https://orcid.org/0000-0002-6417-3600</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0920-8542 |
ispartof | The Journal of supercomputing, 2021-03, Vol.77 (3), p.2875-2896 |
issn | 0920-8542 1573-0484 |
language | eng |
recordid | cdi_proquest_journals_2489026891 |
source | SpringerLink Journals - AutoHoldings |
subjects | Algorithms ; Anomalies ; Classifiers ; Compilers ; Computer Science ; Datasets ; Deep Learning in IoT: Emerging Trends and Applications - 2019 ; Internet of Things ; Interpreters ; Machine learning ; Processor Architectures ; Programming Languages ; Redundancy ; Resampling ; Training |
title | Network anomaly detection based on selective ensemble algorithm |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T20%3A45%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Network%20anomaly%20detection%20based%20on%20selective%20ensemble%20algorithm&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Du,%20Hongle&rft.date=2021-03-01&rft.volume=77&rft.issue=3&rft.spage=2875&rft.epage=2896&rft.pages=2875-2896&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-020-03374-z&rft_dat=%3Cproquest_cross%3E2489026891%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2489026891&rft_id=info:pmid/&rfr_iscdi=true |
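
The description field in this record outlines three mechanical steps that a short sketch can make concrete: under-sampling the majority class into several training subsets, scoring each sub-classifier, and combining the selected sub-classifiers by weighted voting. The Python below is an illustrative sketch only, not the authors' implementation. It assumes that "F-means or G-means" refers to the standard F-measure and G-mean, that label 1 marks the minority (anomaly) class, and that each subset takes all minority samples plus an equally sized random draw from the majority class. All function names are hypothetical, and the AdaBoost training and the maximum correlation / minimum redundancy selection steps are omitted.

```python
import math
import random
from typing import List, Sequence, Tuple


def undersample_subsets(majority: Sequence, minority: Sequence,
                        n_subsets: int, seed: int = 0) -> List[list]:
    """Build balanced subsets: a random majority draw followed by all minority samples.

    Assumption: each subset draws as many majority samples as there are
    minority samples; the record does not state the sampling ratio.
    """
    rng = random.Random(seed)
    k = min(len(minority), len(majority))
    return [rng.sample(list(majority), k) + list(minority)
            for _ in range(n_subsets)]


def confusion(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, fp, tn) with label 1 as the positive (minority) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def g_mean(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Geometric mean of recall (minority class) and specificity (majority class)."""
    tp, fn, fp, tn = confusion(y_true, y_pred)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(recall * specificity)


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the minority class."""
    tp, fn, fp, _ = confusion(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def weighted_vote(predictions: Sequence[Sequence[int]],
                  weights: Sequence[float]) -> List[int]:
    """Combine per-classifier label lists; predict 1 when the weight voting
    for label 1 exceeds half of the total weight."""
    total = sum(weights)
    combined = []
    for votes in zip(*predictions):
        score = sum(w for v, w in zip(votes, weights) if v == 1)
        combined.append(1 if 2 * score > total else 0)
    return combined
```

In use, each sub-classifier's predictions on a validation set would be scored with `g_mean` or `f_measure`, and those scores passed as `weights` to `weighted_vote`. Both metrics reward minority-class recall, which plain accuracy would let the majority class mask.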