Malicious web content detection by machine learning

The recent development of the dynamic HTML gives attackers a new and powerful technique to compromise computer systems. A malicious dynamic HTML code is usually embedded in a normal webpage. The malicious webpage infects the victim when a user browses it. Furthermore, such DHTML code can disguise it...

Bibliographic details
Published in:	Expert systems with applications, 2010, Vol. 37 (1), p. 55-60
Main authors:	Hou, Yung-Tsung; Chang, Yimeng; Chen, Tsuhan; Laih, Chi-Sung; Chen, Chia-Mei
Format:	Article
Language:	eng
Subjects:	Dynamic HTML; Dynamical systems; Dynamics; Expert systems; HTML; HyperText Markup Language; Machine learning; Malicious webpage; Software packages; Transformations
Online access:	Full text

container_end_page	60
container_issue	1
container_start_page	55
container_title	Expert systems with applications
container_volume	37
creator	Hou, Yung-Tsung; Chang, Yimeng; Chen, Tsuhan; Laih, Chi-Sung; Chen, Chia-Mei
description	The recent development of the dynamic HTML gives attackers a new and powerful technique to compromise computer systems. A malicious dynamic HTML code is usually embedded in a normal webpage. The malicious webpage infects the victim when a user browses it. Furthermore, such DHTML code can disguise itself easily through obfuscation or transformation, which makes the detection even harder. Anti-virus software packages commonly use signature-based approaches which might not be able to efficiently identify camouflaged malicious HTML codes. Therefore, our paper proposes a malicious web page detection using the technique of machine learning. Our study analyzes the characteristic of a malicious webpage systematically and presents important features for machine learning. Experimental results demonstrate that our method is resilient to code obfuscations and can correctly determine whether a webpage is malicious or not.
doi_str_mv	10.1016/j.eswa.2009.05.023
format	Article
publisher	Elsevier Ltd
rights	2009 Elsevier Ltd
tpages	6
fulltext	fulltext
identifier	ISSN: 0957-4174
ispartof	Expert systems with applications, 2010, Vol.37 (1), p.55-60
issn	0957-4174 1873-6793
language	eng
recordid	cdi_proquest_miscellaneous_34881505
source	Elsevier ScienceDirect Journals
subjects	Dynamic HTML; Dynamical systems; Dynamics; Expert systems; HTML; HyperText Markup Language; Machine learning; Malicious webpage; Software packages; Transformations
title	Malicious web content detection by machine learning
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T00%3A05%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Malicious%20web%20content%20detection%20by%20machine%20learning&rft.jtitle=Expert%20systems%20with%20applications&rft.au=Hou,%20Yung-Tsung&rft.date=2010&rft.volume=37&rft.issue=1&rft.spage=55&rft.epage=60&rft.pages=55-60&rft.issn=0957-4174&rft.eissn=1873-6793&rft_id=info:doi/10.1016/j.eswa.2009.05.023&rft_dat=%3Cproquest_cross%3E21071708%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1701063037&rft_id=info:pmid/&rft_els_id=S095741740900445X&rfr_iscdi=true