Understanding phishers' strategies of mimicking uniform resource locators to leverage phishing attacks: A machine learning approach

Bibliographic details
Published in:	arXiv.org 2020-07
Main authors:	Tharani, J Samantha, Nalin Asanka Gamagedara Arachchilage
Format:	Article
Language:	English
Subjects:	Access control; Datasets; Feature extraction; Identity theft; Links; Machine learning; Messages; Phishing; Short message service; Social networks; URLs; Websites
Online access:	Full text

Description
Summary:	Phishing is a type of social engineering attack intended to steal user data, including login credentials and credit card numbers, leading to financial losses for both organisations and individuals. It occurs when an attacker, pretending to be a trusted entity, lures a victim into clicking on a link or attachment in an email or text message. Phishing is often launched via email messages or text messages over social networks. Previous research has revealed that phishing attacks can be identified just by looking at URLs. Identifying the techniques that phishers use to make a phishing URL mimic a legitimate one, however, is a challenging issue. At present, we have limited knowledge and understanding of how cybercriminals attempt to mimic URLs with the same look and feel as the legitimate ones to entice people into clicking links. Therefore, this paper investigates the feature selection of phishing URLs (Uniform Resource Locators), aiming to explore the strategies employed by phishers to mimic URLs that can trick people into clicking links. We employed Information Gain (IG) and Chi-Squared feature selection methods in Machine Learning (ML) on a phishing dataset. The dataset contains a total of 48 features extracted from 5000 phishing and 5000 legitimate URLs from web pages downloaded from January to May 2015 and from May to June 2017. Our results revealed 10 techniques that phishers used to mimic URLs to manipulate humans into clicking links. Identifying these phishing URL manipulation techniques would help to educate individuals and organisations and keep them safe from phishing attacks. In addition, the findings of this research will also help develop anti-phishing tools, frameworks or browser plugins for phishing prevention.
ISSN:	2331-8422
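
Illustrative sketch:	The summary names Information Gain (IG) and Chi-Squared as the feature selection methods but does not show how they score a feature. The following is a minimal sketch of both scores for discrete features, using only the Python standard library. It is not the authors' implementation: the feature names (`uses_ip_address`, `has_at_symbol`, `path_has_digits`), the eight toy rows, and the helper `rank_features` are hypothetical and chosen only for illustration, and the Chi-Squared score here is the plain Pearson statistic of the feature/label contingency table.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature, labels):
    """Drop in label entropy after splitting the rows by a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def chi_squared(feature, labels):
    """Pearson chi-squared statistic of the feature/label contingency table."""
    total = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            expected = f_count * l_count / total
            stat += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return stat


def rank_features(features, labels, score):
    """Return feature names sorted from highest to lowest score (hypothetical helper)."""
    return sorted(features, key=lambda name: score(features[name], labels),
                  reverse=True)


# Toy data, invented for illustration: 1 = phishing, 0 = legitimate.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "uses_ip_address": [1, 1, 1, 1, 0, 0, 0, 0],  # separates the classes perfectly
    "has_at_symbol":   [1, 1, 1, 0, 0, 0, 0, 1],  # partly informative
    "path_has_digits": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

for name in rank_features(features, labels, information_gain):
    ig = information_gain(features[name], labels)
    chi2 = chi_squared(features[name], labels)
    print(f"{name}: IG={ig:.4f} chi2={chi2:.4f}")
```

Both scores rise as a feature becomes more strongly associated with the phishing/legitimate label, so ranking the 48 dataset features by either score and keeping the top entries is one plausible way to arrive at a short list of URL manipulation techniques.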