Malicious web content detection by machine learning

Bibliographic Details
Published in:	Expert Systems with Applications, 2010, Vol. 37 (1), pp. 55-60
Main authors:	Hou, Yung-Tsung; Chang, Yimeng; Chen, Tsuhan; Laih, Chi-Sung; Chen, Chia-Mei
Format:	Article
Language:	English
Subjects:	Dynamic HTML; Dynamical systems; Dynamics; Expert systems; HTML; HyperText Markup Language; Machine learning; Malicious webpage; Software packages; Transformations
Online access:	Full text

Description
Abstract:	The recent development of dynamic HTML (DHTML) gives attackers a new and powerful technique for compromising computer systems. Malicious DHTML code is usually embedded in an otherwise normal webpage, and the page infects the victim when a user browses it. Furthermore, such DHTML code can easily disguise itself through obfuscation or transformation, which makes detection even harder. Anti-virus software packages commonly use signature-based approaches, which might not be able to identify camouflaged malicious HTML code efficiently. Therefore, our paper proposes a method for detecting malicious web pages using machine learning. Our study systematically analyzes the characteristics of malicious webpages and presents important features for machine learning. Experimental results demonstrate that our method is resilient to code obfuscation and can correctly determine whether a webpage is malicious.
ISSN:	0957-4174; 1873-6793
DOI:	10.1016/j.eswa.2009.05.023
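
The abstract mentions extracting features from a webpage for machine learning but this record does not list them. The following is a minimal sketch of what such feature extraction could look like, not the authors' method. The function name `extract_features`, the `SUSPICIOUS_CALLS` list, and every feature shown are hypothetical choices made for illustration. Counting behaviours (such as calls to `eval`) rather than matching exact byte signatures is one plausible reason a learned model might tolerate obfuscation better than a signature scanner.

```python
import re

# Hypothetical feature set for illustration only; the catalog record
# does not state which features the paper actually uses.
SUSPICIOUS_CALLS = ("eval", "unescape", "document.write", "fromCharCode")


def extract_features(html: str) -> dict:
    """Turn raw page source into a dictionary of numeric features."""
    lowered = html.lower()
    features = {}

    # Number of call sites for each script function of interest.
    for name in SUSPICIOUS_CALLS:
        pattern = re.escape(name.lower()) + r"\s*\("
        features["count_" + name] = len(re.findall(pattern, lowered))

    # Number of opening <iframe> tags.
    features["count_iframe"] = len(re.findall(r"<iframe\b", lowered))

    # Number of %uXXXX or %XX escape sequences, common in encoded payloads.
    features["count_pct_escape"] = len(
        re.findall(r"%u[0-9a-f]{4}|%[0-9a-f]{2}", lowered)
    )

    # Length of the longest whitespace-free token; packed code tends to be long.
    tokens = re.split(r"\s+", html)
    features["longest_token"] = max((len(t) for t in tokens), default=0)

    return features


if __name__ == "__main__":
    sample = (
        '<html><script>eval(unescape("%u9090%u9090"));</script>'
        '<iframe src="x" width=0></iframe></html>'
    )
    print(extract_features(sample))
```

The resulting dictionary could be converted to a numeric vector and passed to any standard classifier. Which classifier and which features work best is an empirical question that this sketch does not address.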