Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning

To date, industrial antivirus tools mostly use signature-based methods to detect malware. However, sophisticated malware, such as metamorphic or polymorphic viruses, can effectively evade those tools by using advanced obfuscation techniques, including mutation and the dynamicall...

Detailed description

Bibliographic details
Published in:	Computers & security 2018-07, Vol.76, p.128-155
Main authors:	Nguyen, Minh Hai; Nguyen, Dung Le; Nguyen, Xuan Mao; Quan, Tho Thanh
Format:	Article
Language:	eng
Subjects:	Anti-virus software; Binary-based control; Computer programming; Computer viruses; Deep learning; Dynamically executed contents; Flow graph; Graphical representations; Image classification; Lazy-binding CFG; Learning; Malware; Metamorphic virus; Mutation; Packing techniques; Polymorphic virus; Program verification (computers); Software; Studies
Online access:	Full text

container_end_page	155
container_issue
container_start_page	128
container_title	Computers & security
container_volume	76
creator	Nguyen, Minh Hai; Nguyen, Dung Le; Nguyen, Xuan Mao; Quan, Tho Thanh
description	To date, industrial antivirus tools mostly use signature-based methods to detect malware. However, sophisticated malware, such as metamorphic or polymorphic viruses, can effectively evade those tools by using advanced obfuscation techniques, including mutation and dynamically executed contents (DEC) methods, which dynamically produce new executable code at run-time. Common DEC methods used by malware are packing and calling external code. In the research community, program analysis for detecting suspicious behaviors has recently emerged as an approach to this problem. A control flow graph (CFG) is a suitable representation for capturing the common behaviors of various mutated samples of a virus. However, the typical CFG forms generated by state-of-the-art binary analysis tools, such as IDA Pro, do not precisely reflect the behaviors of DEC methods. Moreover, this approach suffers from the extremely heavy cost of constructing and analyzing CFGs from binaries. This drawback makes formal behavior analysis virtually inapplicable to real-world applications. In this paper, we propose an enhanced form of CFG, known as the lazy-binding CFG, to reflect DEC behaviors. Then, building on recent advances in deep learning, we present a method of producing an image-based representation from the generated CFG. As deep learning is widely used for image classification on very large datasets, our proposed technique can be applied to malware detection on real-world computer programs and thus achieve very high accuracy. We also illustrate our analysis results with some well-known malware samples, including WannaCry, Kasperagent and Sality, one of the most sophisticated polymorphic viruses.
doi_str_mv	10.1016/j.cose.2018.02.006
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2094500348</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0167404818300889</els_id><sourcerecordid>2094500348</sourcerecordid><originalsourceid>FETCH-LOGICAL-c328t-a43d0a884da89c47e9db33136fb8cadc651b01220bb6c00f050b2537d157ab783</originalsourceid><addsrcrecordid>eNp9kMtqwzAQRUVpoenjB7oSdG13JPmhQDch9AWBbtq10MuJjGO5ktyQfn1t0nVXw8C5d4aD0B2BnACpHtpc-2hzCoTnQHOA6gwtCK9pVlHg52gxQXVWQMEv0VWMLQCpK84XSKzG5DNjk9XJ-R77Bkc_7FxMTstkDd7L7iCDxWN0_RZ38ueYKdebedG-T8F3uOn8AW-DHHZY9gYbawfcWRn6CbpBF43sor39m9fo8_npY_2abd5f3tarTaYZ5SmTBTMgOS-M5Etd1HZpFGOEVY3iWhpdlUQBoRSUqjRAAyUoWrLakLKWqubsGt2feofgv0Ybk2j9GPrppKCwLEoAVswUPVE6-BiDbcQQ3F6GoyAgZpGiFbNIMYsUQMUkcgo9nkJ2-v_b2SCidrbX1rgwWRPGu__iv7EsfQg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2094500348</pqid></control><display><type>article</type><title>Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning</title><source>Elsevier ScienceDirect Journals</source><creator>Nguyen, Minh Hai ; Nguyen, Dung Le ; Nguyen, Xuan Mao ; Quan, Tho Thanh</creator><creatorcontrib>Nguyen, Minh Hai ; Nguyen, Dung Le ; Nguyen, Xuan Mao ; Quan, Tho Thanh</creatorcontrib><description>To date, industrial antivirus tools are mostly using signature-based methods to detect malware occurrences. However, sophisticated malware, such as metamorphic or polymorphic virus, can effectively evade those tools by using some advanced obfuscation techniques, including mutation and the dynamically executed contents (DEC) methods, which dynamically produce new executable code in the run-time. Common DEC methods used by malware programs are packing or calling external code. In the research community, the approach of program analysis to detect suspicious behaviors has been emerging recently to handle this problem. 
Control flow graph (CFG) is a suitable representation to capture common behaviors from various mutated samples of virus. However, the current typical CFG forms generated by state-of-the-art binary analysis tools, such as IDA Pro, do not precisely reflect the behaviors of DEC methods. Moreover, this approach suffers from an extremely heavy cost to conduct and analyze the CFGs from binaries. This drawback causes the method of formal behavior analysis to be virtually not applicable with real-world applications. In this paper, we propose an enhanced form of CFG, known as lazy-binding CFG to reflect the DEC behaviors. Then, with the recent advancement of the deep learning techniques, we present a method of producing image-based representation from the generated CFG. As deep learning is very popular to perform image classification on very large dataset, our proposed technique can be applied for malware detection on real-world computer programs and thus enjoying very high accuracy. We also illustrate our analysis results with some well-known malware samples, including WannaCry, Kasperagent and Sality, one of the most sophisticated polymorphic viruses.</description><identifier>ISSN: 0167-4048</identifier><identifier>EISSN: 1872-6208</identifier><identifier>DOI: 10.1016/j.cose.2018.02.006</identifier><language>eng</language><publisher>Amsterdam: Elsevier Ltd</publisher><subject>Anti-virus software ; Binary-based control ; Computer programming ; Computer viruses ; Deep learning ; Dynamically executed contents ; Flow graph ; Graphical representations ; Image classification ; Lazy-binding CFG ; Learning ; Malware ; Metamorphic virus ; Mutation ; Packing techniques ; Polymorphic virus ; Program verification (computers) ; Software ; Studies</subject><ispartof>Computers & security, 2018-07, Vol.76, p.128-155</ispartof><rights>2018 Elsevier Ltd</rights><rights>Copyright Elsevier Sequoia S.A. 
Jul 2018</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c328t-a43d0a884da89c47e9db33136fb8cadc651b01220bb6c00f050b2537d157ab783</citedby><cites>FETCH-LOGICAL-c328t-a43d0a884da89c47e9db33136fb8cadc651b01220bb6c00f050b2537d157ab783</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0167404818300889$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65534</link.rule.ids></links><search><creatorcontrib>Nguyen, Minh Hai</creatorcontrib><creatorcontrib>Nguyen, Dung Le</creatorcontrib><creatorcontrib>Nguyen, Xuan Mao</creatorcontrib><creatorcontrib>Quan, Tho Thanh</creatorcontrib><title>Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning</title><title>Computers & security</title><description>To date, industrial antivirus tools are mostly using signature-based methods to detect malware occurrences. However, sophisticated malware, such as metamorphic or polymorphic virus, can effectively evade those tools by using some advanced obfuscation techniques, including mutation and the dynamically executed contents (DEC) methods, which dynamically produce new executable code in the run-time. Common DEC methods used by malware programs are packing or calling external code. In the research community, the approach of program analysis to detect suspicious behaviors has been emerging recently to handle this problem. Control flow graph (CFG) is a suitable representation to capture common behaviors from various mutated samples of virus. However, the current typical CFG forms generated by state-of-the-art binary analysis tools, such as IDA Pro, do not precisely reflect the behaviors of DEC methods. 
Moreover, this approach suffers from an extremely heavy cost to conduct and analyze the CFGs from binaries. This drawback causes the method of formal behavior analysis to be virtually not applicable with real-world applications. In this paper, we propose an enhanced form of CFG, known as lazy-binding CFG to reflect the DEC behaviors. Then, with the recent advancement of the deep learning techniques, we present a method of producing image-based representation from the generated CFG. As deep learning is very popular to perform image classification on very large dataset, our proposed technique can be applied for malware detection on real-world computer programs and thus enjoying very high accuracy. We also illustrate our analysis results with some well-known malware samples, including WannaCry, Kasperagent and Sality, one of the most sophisticated polymorphic viruses.</description><subject>Anti-virus software</subject><subject>Binary-based control</subject><subject>Computer programming</subject><subject>Computer viruses</subject><subject>Deep learning</subject><subject>Dynamically executed contents</subject><subject>Flow graph</subject><subject>Graphical representations</subject><subject>Image classification</subject><subject>Lazy-binding CFG</subject><subject>Learning</subject><subject>Malware</subject><subject>Metamorphic virus</subject><subject>Mutation</subject><subject>Packing techniques</subject><subject>Polymorphic virus</subject><subject>Program verification 
(computers)</subject><subject>Software</subject><subject>Studies</subject><issn>0167-4048</issn><issn>1872-6208</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNp9kMtqwzAQRUVpoenjB7oSdG13JPmhQDch9AWBbtq10MuJjGO5ktyQfn1t0nVXw8C5d4aD0B2BnACpHtpc-2hzCoTnQHOA6gwtCK9pVlHg52gxQXVWQMEv0VWMLQCpK84XSKzG5DNjk9XJ-R77Bkc_7FxMTstkDd7L7iCDxWN0_RZ38ueYKdebedG-T8F3uOn8AW-DHHZY9gYbawfcWRn6CbpBF43sor39m9fo8_npY_2abd5f3tarTaYZ5SmTBTMgOS-M5Etd1HZpFGOEVY3iWhpdlUQBoRSUqjRAAyUoWrLakLKWqubsGt2feofgv0Ybk2j9GPrppKCwLEoAVswUPVE6-BiDbcQQ3F6GoyAgZpGiFbNIMYsUQMUkcgo9nkJ2-v_b2SCidrbX1rgwWRPGu__iv7EsfQg</recordid><startdate>201807</startdate><enddate>201807</enddate><creator>Nguyen, Minh Hai</creator><creator>Nguyen, Dung Le</creator><creator>Nguyen, Xuan Mao</creator><creator>Quan, Tho Thanh</creator><general>Elsevier Ltd</general><general>Elsevier Sequoia S.A</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>K7.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>201807</creationdate><title>Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning</title><author>Nguyen, Minh Hai ; Nguyen, Dung Le ; Nguyen, Xuan Mao ; Quan, Tho Thanh</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c328t-a43d0a884da89c47e9db33136fb8cadc651b01220bb6c00f050b2537d157ab783</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Anti-virus software</topic><topic>Binary-based control</topic><topic>Computer programming</topic><topic>Computer viruses</topic><topic>Deep learning</topic><topic>Dynamically executed contents</topic><topic>Flow graph</topic><topic>Graphical representations</topic><topic>Image classification</topic><topic>Lazy-binding 
CFG</topic><topic>Learning</topic><topic>Malware</topic><topic>Metamorphic virus</topic><topic>Mutation</topic><topic>Packing techniques</topic><topic>Polymorphic virus</topic><topic>Program verification (computers)</topic><topic>Software</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Nguyen, Minh Hai</creatorcontrib><creatorcontrib>Nguyen, Dung Le</creatorcontrib><creatorcontrib>Nguyen, Xuan Mao</creatorcontrib><creatorcontrib>Quan, Tho Thanh</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Criminal Justice (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Computers & security</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Nguyen, Minh Hai</au><au>Nguyen, Dung Le</au><au>Nguyen, Xuan Mao</au><au>Quan, Tho Thanh</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning</atitle><jtitle>Computers & security</jtitle><date>2018-07</date><risdate>2018</risdate><volume>76</volume><spage>128</spage><epage>155</epage><pages>128-155</pages><issn>0167-4048</issn><eissn>1872-6208</eissn><abstract>To date, industrial antivirus tools are mostly using signature-based methods to detect malware occurrences. 
However, sophisticated malware, such as metamorphic or polymorphic virus, can effectively evade those tools by using some advanced obfuscation techniques, including mutation and the dynamically executed contents (DEC) methods, which dynamically produce new executable code in the run-time. Common DEC methods used by malware programs are packing or calling external code. In the research community, the approach of program analysis to detect suspicious behaviors has been emerging recently to handle this problem. Control flow graph (CFG) is a suitable representation to capture common behaviors from various mutated samples of virus. However, the current typical CFG forms generated by state-of-the-art binary analysis tools, such as IDA Pro, do not precisely reflect the behaviors of DEC methods. Moreover, this approach suffers from an extremely heavy cost to conduct and analyze the CFGs from binaries. This drawback causes the method of formal behavior analysis to be virtually not applicable with real-world applications. In this paper, we propose an enhanced form of CFG, known as lazy-binding CFG to reflect the DEC behaviors. Then, with the recent advancement of the deep learning techniques, we present a method of producing image-based representation from the generated CFG. As deep learning is very popular to perform image classification on very large dataset, our proposed technique can be applied for malware detection on real-world computer programs and thus enjoying very high accuracy. We also illustrate our analysis results with some well-known malware samples, including WannaCry, Kasperagent and Sality, one of the most sophisticated polymorphic viruses.</abstract><cop>Amsterdam</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.cose.2018.02.006</doi><tpages>28</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0167-4048
ispartof	Computers & security, 2018-07, Vol.76, p.128-155
issn	0167-4048 1872-6208
language	eng
recordid	cdi_proquest_journals_2094500348
source	Elsevier ScienceDirect Journals
subjects	Anti-virus software; Binary-based control; Computer programming; Computer viruses; Deep learning; Dynamically executed contents; Flow graph; Graphical representations; Image classification; Lazy-binding CFG; Learning; Malware; Metamorphic virus; Mutation; Packing techniques; Polymorphic virus; Program verification (computers); Software; Studies
title	Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-19T11%3A49%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Auto-detection%20of%20sophisticated%20malware%20using%20lazy-binding%20control%20flow%20graph%20and%20deep%20learning&rft.jtitle=Computers%20&%20security&rft.au=Nguyen,%20Minh%20Hai&rft.date=2018-07&rft.volume=76&rft.spage=128&rft.epage=155&rft.pages=128-155&rft.issn=0167-4048&rft.eissn=1872-6208&rft_id=info:doi/10.1016/j.cose.2018.02.006&rft_dat=%3Cproquest_cross%3E2094500348%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2094500348&rft_id=info:pmid/&rft_els_id=S0167404818300889&rfr_iscdi=true