Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning
To date, industrial antivirus tools are mostly using signature-based methods to detect malware occurrences. However, sophisticated malware, such as metamorphic or polymorphic virus, can effectively evade those tools by using some advanced obfuscation techniques, including mutation and the dynamicall...
Published in: | Computers & security 2018-07, Vol.76, p.128-155 |
---|---|
Main authors: | Nguyen, Minh Hai ; Nguyen, Dung Le ; Nguyen, Xuan Mao ; Quan, Tho Thanh |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 155 |
---|---|
container_issue | |
container_start_page | 128 |
container_title | Computers & security |
container_volume | 76 |
creator | Nguyen, Minh Hai ; Nguyen, Dung Le ; Nguyen, Xuan Mao ; Quan, Tho Thanh |
description | To date, industrial antivirus tools are mostly using signature-based methods to detect malware occurrences. However, sophisticated malware, such as metamorphic or polymorphic virus, can effectively evade those tools by using some advanced obfuscation techniques, including mutation and the dynamically executed contents (DEC) methods, which dynamically produce new executable code in the run-time. Common DEC methods used by malware programs are packing or calling external code. In the research community, the approach of program analysis to detect suspicious behaviors has been emerging recently to handle this problem. Control flow graph (CFG) is a suitable representation to capture common behaviors from various mutated samples of virus. However, the current typical CFG forms generated by state-of-the-art binary analysis tools, such as IDA Pro, do not precisely reflect the behaviors of DEC methods. Moreover, this approach suffers from an extremely heavy cost to conduct and analyze the CFGs from binaries. This drawback causes the method of formal behavior analysis to be virtually not applicable with real-world applications.
In this paper, we propose an enhanced form of CFG, known as lazy-binding CFG to reflect the DEC behaviors. Then, with the recent advancement of the deep learning techniques, we present a method of producing image-based representation from the generated CFG. As deep learning is very popular to perform image classification on very large dataset, our proposed technique can be applied for malware detection on real-world computer programs and thus enjoying very high accuracy. We also illustrate our analysis results with some well-known malware samples, including WannaCry, Kasperagent and Sality, one of the most sophisticated polymorphic viruses. |
doi_str_mv | 10.1016/j.cose.2018.02.006 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2094500348</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0167404818300889</els_id><sourcerecordid>2094500348</sourcerecordid><originalsourceid>FETCH-LOGICAL-c328t-a43d0a884da89c47e9db33136fb8cadc651b01220bb6c00f050b2537d157ab783</originalsourceid><addsrcrecordid>eNp9kMtqwzAQRUVpoenjB7oSdG13JPmhQDch9AWBbtq10MuJjGO5ktyQfn1t0nVXw8C5d4aD0B2BnACpHtpc-2hzCoTnQHOA6gwtCK9pVlHg52gxQXVWQMEv0VWMLQCpK84XSKzG5DNjk9XJ-R77Bkc_7FxMTstkDd7L7iCDxWN0_RZ38ueYKdebedG-T8F3uOn8AW-DHHZY9gYbawfcWRn6CbpBF43sor39m9fo8_npY_2abd5f3tarTaYZ5SmTBTMgOS-M5Etd1HZpFGOEVY3iWhpdlUQBoRSUqjRAAyUoWrLakLKWqubsGt2feofgv0Ybk2j9GPrppKCwLEoAVswUPVE6-BiDbcQQ3F6GoyAgZpGiFbNIMYsUQMUkcgo9nkJ2-v_b2SCidrbX1rgwWRPGu__iv7EsfQg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2094500348</pqid></control><display><type>article</type><title>Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning</title><source>Elsevier ScienceDirect Journals</source><creator>Nguyen, Minh Hai ; Nguyen, Dung Le ; Nguyen, Xuan Mao ; Quan, Tho Thanh</creator><creatorcontrib>Nguyen, Minh Hai ; Nguyen, Dung Le ; Nguyen, Xuan Mao ; Quan, Tho Thanh</creatorcontrib><description>To date, industrial antivirus tools are mostly using signature-based methods to detect malware occurrences. However, sophisticated malware, such as metamorphic or polymorphic virus, can effectively evade those tools by using some advanced obfuscation techniques, including mutation and the dynamically executed contents (DEC) methods, which dynamically produce new executable code in the run-time. Common DEC methods used by malware programs are packing or calling external code. In the research community, the approach of program analysis to detect suspicious behaviors has been emerging recently to handle this problem. 
Control flow graph (CFG) is a suitable representation to capture common behaviors from various mutated samples of virus. However, the current typical CFG forms generated by state-of-the-art binary analysis tools, such as IDA Pro, do not precisely reflect the behaviors of DEC methods. Moreover, this approach suffers from an extremely heavy cost to conduct and analyze the CFGs from binaries. This drawback causes the method of formal behavior analysis to be virtually not applicable with real-world applications.
In this paper, we propose an enhanced form of CFG, known as lazy-binding CFG to reflect the DEC behaviors. Then, with the recent advancement of the deep learning techniques, we present a method of producing image-based representation from the generated CFG. As deep learning is very popular to perform image classification on very large dataset, our proposed technique can be applied for malware detection on real-world computer programs and thus enjoying very high accuracy. We also illustrate our analysis results with some well-known malware samples, including WannaCry, Kasperagent and Sality, one of the most sophisticated polymorphic viruses.</description><identifier>ISSN: 0167-4048</identifier><identifier>EISSN: 1872-6208</identifier><identifier>DOI: 10.1016/j.cose.2018.02.006</identifier><language>eng</language><publisher>Amsterdam: Elsevier Ltd</publisher><subject>Anti-virus software ; Binary-based control ; Computer programming ; Computer viruses ; Deep learning ; Dynamically executed contents ; Flow graph ; Graphical representations ; Image classification ; Lazy-binding CFG ; Learning ; Malware ; Metamorphic virus ; Mutation ; Packing techniques ; Polymorphic virus ; Program verification (computers) ; Software ; Studies</subject><ispartof>Computers & security, 2018-07, Vol.76, p.128-155</ispartof><rights>2018 Elsevier Ltd</rights><rights>Copyright Elsevier Sequoia S.A. 
Jul 2018</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c328t-a43d0a884da89c47e9db33136fb8cadc651b01220bb6c00f050b2537d157ab783</citedby><cites>FETCH-LOGICAL-c328t-a43d0a884da89c47e9db33136fb8cadc651b01220bb6c00f050b2537d157ab783</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0167404818300889$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65534</link.rule.ids></links><search><creatorcontrib>Nguyen, Minh Hai</creatorcontrib><creatorcontrib>Nguyen, Dung Le</creatorcontrib><creatorcontrib>Nguyen, Xuan Mao</creatorcontrib><creatorcontrib>Quan, Tho Thanh</creatorcontrib><title>Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning</title><title>Computers & security</title><description>To date, industrial antivirus tools are mostly using signature-based methods to detect malware occurrences. However, sophisticated malware, such as metamorphic or polymorphic virus, can effectively evade those tools by using some advanced obfuscation techniques, including mutation and the dynamically executed contents (DEC) methods, which dynamically produce new executable code in the run-time. Common DEC methods used by malware programs are packing or calling external code. In the research community, the approach of program analysis to detect suspicious behaviors has been emerging recently to handle this problem. Control flow graph (CFG) is a suitable representation to capture common behaviors from various mutated samples of virus. However, the current typical CFG forms generated by state-of-the-art binary analysis tools, such as IDA Pro, do not precisely reflect the behaviors of DEC methods. 
Moreover, this approach suffers from an extremely heavy cost to conduct and analyze the CFGs from binaries. This drawback causes the method of formal behavior analysis to be virtually not applicable with real-world applications.
In this paper, we propose an enhanced form of CFG, known as lazy-binding CFG to reflect the DEC behaviors. Then, with the recent advancement of the deep learning techniques, we present a method of producing image-based representation from the generated CFG. As deep learning is very popular to perform image classification on very large dataset, our proposed technique can be applied for malware detection on real-world computer programs and thus enjoying very high accuracy. We also illustrate our analysis results with some well-known malware samples, including WannaCry, Kasperagent and Sality, one of the most sophisticated polymorphic viruses.</description><subject>Anti-virus software</subject><subject>Binary-based control</subject><subject>Computer programming</subject><subject>Computer viruses</subject><subject>Deep learning</subject><subject>Dynamically executed contents</subject><subject>Flow graph</subject><subject>Graphical representations</subject><subject>Image classification</subject><subject>Lazy-binding CFG</subject><subject>Learning</subject><subject>Malware</subject><subject>Metamorphic virus</subject><subject>Mutation</subject><subject>Packing techniques</subject><subject>Polymorphic virus</subject><subject>Program verification 
(computers)</subject><subject>Software</subject><subject>Studies</subject><issn>0167-4048</issn><issn>1872-6208</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNp9kMtqwzAQRUVpoenjB7oSdG13JPmhQDch9AWBbtq10MuJjGO5ktyQfn1t0nVXw8C5d4aD0B2BnACpHtpc-2hzCoTnQHOA6gwtCK9pVlHg52gxQXVWQMEv0VWMLQCpK84XSKzG5DNjk9XJ-R77Bkc_7FxMTstkDd7L7iCDxWN0_RZ38ueYKdebedG-T8F3uOn8AW-DHHZY9gYbawfcWRn6CbpBF43sor39m9fo8_npY_2abd5f3tarTaYZ5SmTBTMgOS-M5Etd1HZpFGOEVY3iWhpdlUQBoRSUqjRAAyUoWrLakLKWqubsGt2feofgv0Ybk2j9GPrppKCwLEoAVswUPVE6-BiDbcQQ3F6GoyAgZpGiFbNIMYsUQMUkcgo9nkJ2-v_b2SCidrbX1rgwWRPGu__iv7EsfQg</recordid><startdate>201807</startdate><enddate>201807</enddate><creator>Nguyen, Minh Hai</creator><creator>Nguyen, Dung Le</creator><creator>Nguyen, Xuan Mao</creator><creator>Quan, Tho Thanh</creator><general>Elsevier Ltd</general><general>Elsevier Sequoia S.A</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>K7.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>201807</creationdate><title>Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning</title><author>Nguyen, Minh Hai ; Nguyen, Dung Le ; Nguyen, Xuan Mao ; Quan, Tho Thanh</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c328t-a43d0a884da89c47e9db33136fb8cadc651b01220bb6c00f050b2537d157ab783</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Anti-virus software</topic><topic>Binary-based control</topic><topic>Computer programming</topic><topic>Computer viruses</topic><topic>Deep learning</topic><topic>Dynamically executed contents</topic><topic>Flow graph</topic><topic>Graphical representations</topic><topic>Image classification</topic><topic>Lazy-binding 
CFG</topic><topic>Learning</topic><topic>Malware</topic><topic>Metamorphic virus</topic><topic>Mutation</topic><topic>Packing techniques</topic><topic>Polymorphic virus</topic><topic>Program verification (computers)</topic><topic>Software</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Nguyen, Minh Hai</creatorcontrib><creatorcontrib>Nguyen, Dung Le</creatorcontrib><creatorcontrib>Nguyen, Xuan Mao</creatorcontrib><creatorcontrib>Quan, Tho Thanh</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Criminal Justice (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Computers & security</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Nguyen, Minh Hai</au><au>Nguyen, Dung Le</au><au>Nguyen, Xuan Mao</au><au>Quan, Tho Thanh</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning</atitle><jtitle>Computers & security</jtitle><date>2018-07</date><risdate>2018</risdate><volume>76</volume><spage>128</spage><epage>155</epage><pages>128-155</pages><issn>0167-4048</issn><eissn>1872-6208</eissn><abstract>To date, industrial antivirus tools are mostly using signature-based methods to detect malware occurrences. 
However, sophisticated malware, such as metamorphic or polymorphic virus, can effectively evade those tools by using some advanced obfuscation techniques, including mutation and the dynamically executed contents (DEC) methods, which dynamically produce new executable code in the run-time. Common DEC methods used by malware programs are packing or calling external code. In the research community, the approach of program analysis to detect suspicious behaviors has been emerging recently to handle this problem. Control flow graph (CFG) is a suitable representation to capture common behaviors from various mutated samples of virus. However, the current typical CFG forms generated by state-of-the-art binary analysis tools, such as IDA Pro, do not precisely reflect the behaviors of DEC methods. Moreover, this approach suffers from an extremely heavy cost to conduct and analyze the CFGs from binaries. This drawback causes the method of formal behavior analysis to be virtually not applicable with real-world applications.
In this paper, we propose an enhanced form of CFG, known as lazy-binding CFG to reflect the DEC behaviors. Then, with the recent advancement of the deep learning techniques, we present a method of producing image-based representation from the generated CFG. As deep learning is very popular to perform image classification on very large dataset, our proposed technique can be applied for malware detection on real-world computer programs and thus enjoying very high accuracy. We also illustrate our analysis results with some well-known malware samples, including WannaCry, Kasperagent and Sality, one of the most sophisticated polymorphic viruses.</abstract><cop>Amsterdam</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.cose.2018.02.006</doi><tpages>28</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0167-4048 |
ispartof | Computers & security, 2018-07, Vol.76, p.128-155 |
issn | 0167-4048 ; 1872-6208 |
language | eng |
recordid | cdi_proquest_journals_2094500348 |
source | Elsevier ScienceDirect Journals |
subjects | Anti-virus software ; Binary-based control ; Computer programming ; Computer viruses ; Deep learning ; Dynamically executed contents ; Flow graph ; Graphical representations ; Image classification ; Lazy-binding CFG ; Learning ; Malware ; Metamorphic virus ; Mutation ; Packing techniques ; Polymorphic virus ; Program verification (computers) ; Software ; Studies |
title | Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-19T11%3A49%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Auto-detection%20of%20sophisticated%20malware%20using%20lazy-binding%20control%20flow%20graph%20and%20deep%20learning&rft.jtitle=Computers%20&%20security&rft.au=Nguyen,%20Minh%20Hai&rft.date=2018-07&rft.volume=76&rft.spage=128&rft.epage=155&rft.pages=128-155&rft.issn=0167-4048&rft.eissn=1872-6208&rft_id=info:doi/10.1016/j.cose.2018.02.006&rft_dat=%3Cproquest_cross%3E2094500348%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2094500348&rft_id=info:pmid/&rft_els_id=S0167404818300889&rfr_iscdi=true |