An empirical study to estimate the stability of random forest classifier on the hybrid features recommended by filter based feature selection technique

The emergence of advanced malware is a serious threat to information security. A prominent technique that identifies sophisticated malware should consider the runtime behaviour of the source file to detect malicious intent. Although the behaviour-based malware detection technique is a substantial im...

Bibliographic details
Published in:	International journal of machine learning and cybernetics 2020-02, Vol.11 (2), p.339-358
Main authors:	Darshan, S. L. Shiva; Jaidhar, C. D.
Format:	Article
Language:	eng
Subjects:	Accuracy; Application programming interface; Artificial Intelligence; Classifiers; Complex Systems; Computational Intelligence; Control; Cybersecurity; Datasets; Decision analysis; Decision trees; Disk operating systems; Dynamic link libraries; Empirical analysis; Engineering; Hybrid systems; Machine learning; Malware; Mechatronics; Original Article; Pattern Recognition; Robotics; Systems Biology
Online access:	Full text

container_end_page	358
container_issue	2
container_start_page	339
container_title	International journal of machine learning and cybernetics
container_volume	11
creator	Darshan, S. L. Shiva; Jaidhar, C. D.
description	The emergence of advanced malware is a serious threat to information security. A prominent technique that identifies sophisticated malware should consider the runtime behaviour of the source file to detect malicious intent. Although the behaviour-based malware detection technique is a substantial improvement over the traditional signature-based detection technique, current malware employs code obfuscation techniques to elude detection. This paper presents the Hybrid Features-based malware detection system (HFMDS), which integrates static and dynamic features of portable executable (PE) files to discern malware. The HFMDS is trained with the prominent features recommended by a filter-based feature selection technique (FST). The detection ability of the proposed HFMDS is evaluated with the random forest (RF) classifier on two different datasets that consist of real-world Windows malware samples. An in-depth analysis is carried out to determine the optimal number of decision trees (DTs) required by the RF classifier to achieve consistent accuracy. In addition, the performance of four popular FSTs is analyzed to determine which FST recommends the best features. From the experimental analysis, we can infer that increasing the number of DTs within the RF classifier beyond 160 does not significantly improve detection accuracy.
doi_str_mv	10.1007/s13042-019-00978-7
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2920625284</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2920625284</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-61b0ed5aff07997361a30377bd7ff49a27f0738022d664469393181f8bba1e83</originalsourceid><addsrcrecordid>eNp9kc1OxCAUhRujiROdF3BF4rp6gVpgOZn4l0ziZhbuCG0vDqYtIzCLPomvK1oz7mQDuXznnOSeoriicEMBxG2kHCpWAlUlgBKyFCfFgspalhLk6-nxLeh5sYzxHfKpgXNgi-JzNRIc9i641vQkpkM3keQJxuQGk5CkHeapaVzv0kS8JcGMnR-I9SEzpO1NjM46DMSPP_BuaoLriEWTDhkhAVs_DDh22JFmItb1KcONiXiESMQe2-S-HbDdje7jgJfFmTV9xOXvfVFsH-6366dy8_L4vF5typZTlcqaNoDdnbEWhFKC19Rw4EI0nbC2UoaJ_MElMNbVdVXViitOJbWyaQxFyS-K69l2H3xOjUm_-0MYc6JmikHN7pisMsVmqg0-xoBW70NeT5g0Bf1dgZ4r0LkC_VOBFlnEZ1HM8PiG4c_6H9UXi7aL8A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2920625284</pqid></control><display><type>article</type><title>An empirical study to estimate the stability of random forest classifier on the hybrid features recommended by filter based feature selection technique</title><source>SpringerNature Journals</source><source>ProQuest Central UK/Ireland</source><source>ProQuest Central</source><creator>Darshan, S. L. Shiva ; Jaidhar, C. D.</creator><creatorcontrib>Darshan, S. L. Shiva ; Jaidhar, C. D.</creatorcontrib><description>The emergence of advanced malware is a serious threat to information security. A prominent technique that identifies sophisticated malware should consider the runtime behaviour of the source file to detect malicious intent. Although the behaviour-based malware detection technique is a substantial improvement over the traditional signature-based detection technique, current malware employs code obfuscation techniques to elude detection. 
This paper presents the Hybrid Features-based malware detection system (HFMDS) that integrates static and dynamic features of the portable executable (PE) files to discern malware. The HFMDS is trained with prominent features advised by the filter-based feature selection technique (FST). The detection ability of the proposed HFMDS has evaluated with the random forest (RF) classifier by considering two different datasets that consist of real-world Windows malware samples. In-depth analysis is carried out to determine the optimal number of decision trees (DTs) required by the RF classifier to achieve consistent accuracy. Besides, four popular FSTs performance is also analyzed to determine which FST recommends the best features. From the experimental analysis, we can infer that increasing the number of DTs after 160 within the RF classifier does not make a significant difference in attaining better detection accuracy.</description><identifier>ISSN: 1868-8071</identifier><identifier>EISSN: 1868-808X</identifier><identifier>DOI: 10.1007/s13042-019-00978-7</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Accuracy ; Application programming interface ; Artificial Intelligence ; Classifiers ; Complex Systems ; Computational Intelligence ; Control ; Cybersecurity ; Datasets ; Decision analysis ; Decision trees ; Disk operating systems ; Dynamic link libraries ; Empirical analysis ; Engineering ; Hybrid systems ; Machine learning ; Malware ; Mechatronics ; Original Article ; Pattern Recognition ; Robotics ; Systems Biology</subject><ispartof>International journal of machine learning and cybernetics, 2020-02, Vol.11 (2), p.339-358</ispartof><rights>Springer-Verlag GmbH Germany, part of Springer Nature 2019</rights><rights>Springer-Verlag GmbH Germany, part of Springer Nature 
2019.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-61b0ed5aff07997361a30377bd7ff49a27f0738022d664469393181f8bba1e83</citedby><cites>FETCH-LOGICAL-c319t-61b0ed5aff07997361a30377bd7ff49a27f0738022d664469393181f8bba1e83</cites><orcidid>0000-0001-9556-1342</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s13042-019-00978-7$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2920625284?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>314,780,784,21388,27924,27925,33744,41488,42557,43805,51319,64385,64389,72469</link.rule.ids></links><search><creatorcontrib>Darshan, S. L. Shiva</creatorcontrib><creatorcontrib>Jaidhar, C. D.</creatorcontrib><title>An empirical study to estimate the stability of random forest classifier on the hybrid features recommended by filter based feature selection technique</title><title>International journal of machine learning and cybernetics</title><addtitle>Int. J. Mach. Learn. & Cyber</addtitle><description>The emergence of advanced malware is a serious threat to information security. A prominent technique that identifies sophisticated malware should consider the runtime behaviour of the source file to detect malicious intent. Although the behaviour-based malware detection technique is a substantial improvement over the traditional signature-based detection technique, current malware employs code obfuscation techniques to elude detection. This paper presents the Hybrid Features-based malware detection system (HFMDS) that integrates static and dynamic features of the portable executable (PE) files to discern malware. The HFMDS is trained with prominent features advised by the filter-based feature selection technique (FST). 
The detection ability of the proposed HFMDS has evaluated with the random forest (RF) classifier by considering two different datasets that consist of real-world Windows malware samples. In-depth analysis is carried out to determine the optimal number of decision trees (DTs) required by the RF classifier to achieve consistent accuracy. Besides, four popular FSTs performance is also analyzed to determine which FST recommends the best features. From the experimental analysis, we can infer that increasing the number of DTs after 160 within the RF classifier does not make a significant difference in attaining better detection accuracy.</description><subject>Accuracy</subject><subject>Application programming interface</subject><subject>Artificial Intelligence</subject><subject>Classifiers</subject><subject>Complex Systems</subject><subject>Computational Intelligence</subject><subject>Control</subject><subject>Cybersecurity</subject><subject>Datasets</subject><subject>Decision analysis</subject><subject>Decision trees</subject><subject>Disk operating systems</subject><subject>Dynamic link libraries</subject><subject>Empirical analysis</subject><subject>Engineering</subject><subject>Hybrid systems</subject><subject>Machine learning</subject><subject>Malware</subject><subject>Mechatronics</subject><subject>Original Article</subject><subject>Pattern Recognition</subject><subject>Robotics</subject><subject>Systems 
Biology</subject><issn>1868-8071</issn><issn>1868-808X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNp9kc1OxCAUhRujiROdF3BF4rp6gVpgOZn4l0ziZhbuCG0vDqYtIzCLPomvK1oz7mQDuXznnOSeoriicEMBxG2kHCpWAlUlgBKyFCfFgspalhLk6-nxLeh5sYzxHfKpgXNgi-JzNRIc9i641vQkpkM3keQJxuQGk5CkHeapaVzv0kS8JcGMnR-I9SEzpO1NjM46DMSPP_BuaoLriEWTDhkhAVs_DDh22JFmItb1KcONiXiESMQe2-S-HbDdje7jgJfFmTV9xOXvfVFsH-6366dy8_L4vF5typZTlcqaNoDdnbEWhFKC19Rw4EI0nbC2UoaJ_MElMNbVdVXViitOJbWyaQxFyS-K69l2H3xOjUm_-0MYc6JmikHN7pisMsVmqg0-xoBW70NeT5g0Bf1dgZ4r0LkC_VOBFlnEZ1HM8PiG4c_6H9UXi7aL8A</recordid><startdate>20200201</startdate><enddate>20200201</enddate><creator>Darshan, S. L. Shiva</creator><creator>Jaidhar, C. D.</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>L6V</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PTHSS</scope><orcidid>https://orcid.org/0000-0001-9556-1342</orcidid></search><sort><creationdate>20200201</creationdate><title>An empirical study to estimate the stability of random forest classifier on the hybrid features recommended by filter based feature selection technique</title><author>Darshan, S. L. Shiva ; Jaidhar, C. 
D.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-61b0ed5aff07997361a30377bd7ff49a27f0738022d664469393181f8bba1e83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Accuracy</topic><topic>Application programming interface</topic><topic>Artificial Intelligence</topic><topic>Classifiers</topic><topic>Complex Systems</topic><topic>Computational Intelligence</topic><topic>Control</topic><topic>Cybersecurity</topic><topic>Datasets</topic><topic>Decision analysis</topic><topic>Decision trees</topic><topic>Disk operating systems</topic><topic>Dynamic link libraries</topic><topic>Empirical analysis</topic><topic>Engineering</topic><topic>Hybrid systems</topic><topic>Machine learning</topic><topic>Malware</topic><topic>Mechatronics</topic><topic>Original Article</topic><topic>Pattern Recognition</topic><topic>Robotics</topic><topic>Systems Biology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Darshan, S. L. Shiva</creatorcontrib><creatorcontrib>Jaidhar, C. 
D.</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>Engineering Collection</collection><jtitle>International journal of machine learning and cybernetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Darshan, S. L. Shiva</au><au>Jaidhar, C. D.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An empirical study to estimate the stability of random forest classifier on the hybrid features recommended by filter based feature selection technique</atitle><jtitle>International journal of machine learning and cybernetics</jtitle><stitle>Int. J. Mach. Learn. 
& Cyber</stitle><date>2020-02-01</date><risdate>2020</risdate><volume>11</volume><issue>2</issue><spage>339</spage><epage>358</epage><pages>339-358</pages><issn>1868-8071</issn><eissn>1868-808X</eissn><abstract>The emergence of advanced malware is a serious threat to information security. A prominent technique that identifies sophisticated malware should consider the runtime behaviour of the source file to detect malicious intent. Although the behaviour-based malware detection technique is a substantial improvement over the traditional signature-based detection technique, current malware employs code obfuscation techniques to elude detection. This paper presents the Hybrid Features-based malware detection system (HFMDS) that integrates static and dynamic features of the portable executable (PE) files to discern malware. The HFMDS is trained with prominent features advised by the filter-based feature selection technique (FST). The detection ability of the proposed HFMDS has evaluated with the random forest (RF) classifier by considering two different datasets that consist of real-world Windows malware samples. In-depth analysis is carried out to determine the optimal number of decision trees (DTs) required by the RF classifier to achieve consistent accuracy. Besides, four popular FSTs performance is also analyzed to determine which FST recommends the best features. From the experimental analysis, we can infer that increasing the number of DTs after 160 within the RF classifier does not make a significant difference in attaining better detection accuracy.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s13042-019-00978-7</doi><tpages>20</tpages><orcidid>https://orcid.org/0000-0001-9556-1342</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 1868-8071
ispartof	International journal of machine learning and cybernetics, 2020-02, Vol.11 (2), p.339-358
issn	1868-8071 1868-808X
language	eng
recordid	cdi_proquest_journals_2920625284
source	SpringerNature Journals; ProQuest Central UK/Ireland; ProQuest Central
subjects	Accuracy; Application programming interface; Artificial Intelligence; Classifiers; Complex Systems; Computational Intelligence; Control; Cybersecurity; Datasets; Decision analysis; Decision trees; Disk operating systems; Dynamic link libraries; Empirical analysis; Engineering; Hybrid systems; Machine learning; Malware; Mechatronics; Original Article; Pattern Recognition; Robotics; Systems Biology
title	An empirical study to estimate the stability of random forest classifier on the hybrid features recommended by filter based feature selection technique
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T07%3A53%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20empirical%20study%20to%20estimate%20the%20stability%20of%20random%20forest%20classifier%20on%20the%20hybrid%20features%20recommended%20by%20filter%20based%20feature%20selection%20technique&rft.jtitle=International%20journal%20of%20machine%20learning%20and%20cybernetics&rft.au=Darshan,%20S.%20L.%20Shiva&rft.date=2020-02-01&rft.volume=11&rft.issue=2&rft.spage=339&rft.epage=358&rft.pages=339-358&rft.issn=1868-8071&rft.eissn=1868-808X&rft_id=info:doi/10.1007/s13042-019-00978-7&rft_dat=%3Cproquest_cross%3E2920625284%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2920625284&rft_id=info:pmid/&rfr_iscdi=true
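
The description in this record mentions filter-based feature selection without naming the four techniques that were compared. As a rough illustration of what a filter-based FST does, the sketch below ranks discrete features by information gain, which is one common filter criterion and is an assumption here, not necessarily one of the paper's four. The function names and the toy data are hypothetical and are not taken from the article or its datasets.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on one discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


def select_top_k(rows, labels, k):
    """Score every feature independently of any classifier and keep the best k."""
    n_features = len(rows[0])
    scores = [information_gain([r[j] for r in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: each row is a sample, each column a binary feature
# (for example, whether a given API call was observed); 1 = malware, 0 = benign.
rows = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
labels = [1, 1, 0, 0]

top, scores = select_top_k(rows, labels, k=1)
print(top, scores)
```

Because a filter scores each feature without training the classifier, the selected subset could afterwards be passed to any model, such as the random forest described in the record.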