New filtering approaches for phishing email

Phishing emails usually contain a message from a credible looking source requesting a user to click a link to a website where she/he is asked to enter a password or other confidential information. Most phishing emails aim at withdrawing money from financial institutions or getting access to private...

Bibliographic details
Published in: Journal of computer security 2010-01, Vol.18 (1), p.7-35
Main authors: Bergholz, André; De Beer, Jan; Glahn, Sebastian; Moens, Marie-Francine; Paaß, Gerhard; Strobel, Siehyun
Format: Article
Language: eng
Online access: Full text
container_end_page 35
container_issue 1
container_start_page 7
container_title Journal of computer security
container_volume 18
creator Bergholz, André
De Beer, Jan
Glahn, Sebastian
Moens, Marie-Francine
Paaß, Gerhard
Strobel, Siehyun
description Phishing emails usually contain a message from a credible looking source requesting a user to click a link to a website where she/he is asked to enter a password or other confidential information. Most phishing emails aim at withdrawing money from financial institutions or getting access to private information. Phishing has increased enormously over the last years and is a serious threat to global security and economy. There are a number of possible countermeasures to phishing. These range from communication-oriented approaches like authentication protocols over blacklisting to content-based filtering approaches. We argue that the first two approaches are currently not broadly implemented or exhibit deficits. Therefore content-based phishing filters are necessary and widely used to increase communication security. A number of features are extracted capturing the content and structural properties of the email. Subsequently a statistical classifier is trained using these features on a training set of emails labeled as ham (legitimate), spam or phishing. This classifier may then be applied to an email stream to estimate the classes of new incoming emails. In this paper we describe a number of novel features that are particularly well-suited to identify phishing emails. These include statistical models for the low-dimensional descriptions of email topics, sequential analysis of email text and external links, the detection of embedded logos as well as indicators for hidden salting. Hidden salting is the intentional addition or distortion of content not perceivable by the reader. For empirical evaluation we have obtained a large realistic corpus of emails prelabeled as spam, phishing, and ham (legitimate). In experiments our methods outperform other published approaches for classifying phishing emails. We discuss the implications of these results for the practical application of this approach in the workflow of an email provider. Finally we describe a strategy how the filters may be updated and adapted to new types of phishing.
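Illustrative note (not part of the catalog record): the abstract defines hidden salting as content added or distorted so that the reader cannot perceive it. The record does not say how the authors compute their indicators, so the following is only a minimal sketch of one possible indicator, assuming HTML email bodies and assuming that text inside elements with inline styles such as display:none counts as hidden. The names HiddenTextCounter, hidden_salting_ratio and HIDDEN_STYLE are hypothetical, and the pattern list is deliberately incomplete.

```python
import re
from html.parser import HTMLParser

# Inline-style patterns that hide text from the reader.
# Assumption for illustration only; real salting uses many more tricks.
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|font-size\s*:\s*0(?![\d.])",
    re.IGNORECASE,
)

# Elements without a closing tag; they never contain text, so skip them.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HiddenTextCounter(HTMLParser):
    """Counts characters of visible and of style-hidden text."""

    def __init__(self):
        super().__init__()
        self.stack = []          # one bool per open element: hidden or not
        self.hidden_chars = 0
        self.visible_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = self.stack[-1] if self.stack else False
        # An element is hidden if it or any ancestor carries a hiding style.
        self.stack.append(parent_hidden or bool(HIDDEN_STYLE.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        n = len(data.strip())
        if self.stack and self.stack[-1]:
            self.hidden_chars += n
        else:
            self.visible_chars += n


def hidden_salting_ratio(html):
    """Return the fraction of text characters that are style-hidden."""
    parser = HiddenTextCounter()
    parser.feed(html)
    parser.close()               # flush any buffered trailing text
    total = parser.hidden_chars + parser.visible_chars
    return parser.hidden_chars / total if total else 0.0
```

A ratio like this could serve as one numeric feature among the others the abstract lists, fed to a classifier trained on emails labeled ham, spam or phishing. The sketch assumes reasonably well-nested HTML and ignores stylesheets, colors and image-based tricks.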
doi_str_mv 10.3233/JCS-2010-0371
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_36337154</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>36337154</sourcerecordid><originalsourceid>FETCH-LOGICAL-c312t-30e5cf905a5ada77e91ecfd360e3b0f0ad7c1ada033e32986a69d9715812d5443</originalsourceid><addsrcrecordid>eNotkEFLxDAQhYMouK4evffkRaKTzKZpjrLoqix6UMFbiOnEjbTbmnQR_70t6-nBe4_HzMfYuYArlIjXj8sXLkEAB9TigM1EpRWvjFwcshkYWXIp9fsxO8n5C0AKYaoZu3yinyLEZqAUt5-F6_vUOb-hXIQuFf0m5s3kU-tic8qOgmsynf3rnL3d3b4u7_n6efWwvFlzj0IOHIGUDwaUU652WpMR5EONJRB-QABXay_GBBAJpalKV5raaKEqIWu1WOCcXex3x1u-d5QH28bsqWnclrpdtlji-KCainxf9KnLOVGwfYqtS79WgJ2Q2BGJnZDYCQn-AeAAUwI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>36337154</pqid></control><display><type>article</type><title>New filtering approaches for phishing email</title><source>Business Source Complete</source><creator>Bergholz, André ; De Beer, Jan ; Glahn, Sebastian ; Moens, Marie-Francine ; Paaß, Gerhard ; Strobel, Siehyun</creator><contributor>Skordas, Thomas ; Lopez, Javier ; Camenisch, Jan ; Massacci, Fabio ; Ciscato, Massimo</contributor><creatorcontrib>Bergholz, André ; De Beer, Jan ; Glahn, Sebastian ; Moens, Marie-Francine ; Paaß, Gerhard ; Strobel, Siehyun ; Skordas, Thomas ; Lopez, Javier ; Camenisch, Jan ; Massacci, Fabio ; Ciscato, Massimo</creatorcontrib><description>Phishing emails usually contain a message from a credible looking source requesting a user to click a link to a website where she/he is asked to enter a password or other confidential information. Most phishing emails aim at withdrawing money from financial institutions or getting access to private information. Phishing has increased enormously over the last years and is a serious threat to global security and economy. There are a number of possible countermeasures to phishing. 
These range from communication-oriented approaches like authentication protocols over blacklisting to content-based filtering approaches. We argue that the first two approaches are currently not broadly implemented or exhibit deficits. Therefore content-based phishing filters are necessary and widely used to increase communication security. A number of features are extracted capturing the content and structural properties of the email. Subsequently a statistical classifier is trained using these features on a training set of emails labeled as ham (legitimate), spam or phishing. This classifier may then be applied to an email stream to estimate the classes of new incoming emails. In this paper we describe a number of novel features that are particularly well-suited to identify phishing emails. These include statistical models for the low-dimensional descriptions of email topics, sequential analysis of email text and external links, the detection of embedded logos as well as indicators for hidden salting. Hidden salting is the intentional addition or distortion of content not perceivable by the reader. For empirical evaluation we have obtained a large realistic corpus of emails prelabeled as spam, phishing, and ham (legitimate). In experiments our methods outperform other published approaches for classifying phishing emails. We discuss the implications of these results for the practical application of this approach in the workflow of an email provider. 
Finally we describe a strategy how the filters may be updated and adapted to new types of phishing.</description><identifier>ISSN: 0926-227X</identifier><identifier>EISSN: 1875-8924</identifier><identifier>DOI: 10.3233/JCS-2010-0371</identifier><language>eng</language><ispartof>Journal of computer security, 2010-01, Vol.18 (1), p.7-35</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c312t-30e5cf905a5ada77e91ecfd360e3b0f0ad7c1ada033e32986a69d9715812d5443</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27915,27916</link.rule.ids></links><search><contributor>Skordas, Thomas</contributor><contributor>Lopez, Javier</contributor><contributor>Camenisch, Jan</contributor><contributor>Massacci, Fabio</contributor><contributor>Ciscato, Massimo</contributor><creatorcontrib>Bergholz, André</creatorcontrib><creatorcontrib>De Beer, Jan</creatorcontrib><creatorcontrib>Glahn, Sebastian</creatorcontrib><creatorcontrib>Moens, Marie-Francine</creatorcontrib><creatorcontrib>Paaß, Gerhard</creatorcontrib><creatorcontrib>Strobel, Siehyun</creatorcontrib><title>New filtering approaches for phishing email</title><title>Journal of computer security</title><description>Phishing emails usually contain a message from a credible looking source requesting a user to click a link to a website where she/he is asked to enter a password or other confidential information. Most phishing emails aim at withdrawing money from financial institutions or getting access to private information. Phishing has increased enormously over the last years and is a serious threat to global security and economy. There are a number of possible countermeasures to phishing. 
These range from communication-oriented approaches like authentication protocols over blacklisting to content-based filtering approaches. We argue that the first two approaches are currently not broadly implemented or exhibit deficits. Therefore content-based phishing filters are necessary and widely used to increase communication security. A number of features are extracted capturing the content and structural properties of the email. Subsequently a statistical classifier is trained using these features on a training set of emails labeled as ham (legitimate), spam or phishing. This classifier may then be applied to an email stream to estimate the classes of new incoming emails. In this paper we describe a number of novel features that are particularly well-suited to identify phishing emails. These include statistical models for the low-dimensional descriptions of email topics, sequential analysis of email text and external links, the detection of embedded logos as well as indicators for hidden salting. Hidden salting is the intentional addition or distortion of content not perceivable by the reader. For empirical evaluation we have obtained a large realistic corpus of emails prelabeled as spam, phishing, and ham (legitimate). In experiments our methods outperform other published approaches for classifying phishing emails. We discuss the implications of these results for the practical application of this approach in the workflow of an email provider. 
Finally we describe a strategy how the filters may be updated and adapted to new types of phishing.</description><issn>0926-227X</issn><issn>1875-8924</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2010</creationdate><recordtype>article</recordtype><recordid>eNotkEFLxDAQhYMouK4evffkRaKTzKZpjrLoqix6UMFbiOnEjbTbmnQR_70t6-nBe4_HzMfYuYArlIjXj8sXLkEAB9TigM1EpRWvjFwcshkYWXIp9fsxO8n5C0AKYaoZu3yinyLEZqAUt5-F6_vUOb-hXIQuFf0m5s3kU-tic8qOgmsynf3rnL3d3b4u7_n6efWwvFlzj0IOHIGUDwaUU652WpMR5EONJRB-QABXay_GBBAJpalKV5raaKEqIWu1WOCcXex3x1u-d5QH28bsqWnclrpdtlji-KCainxf9KnLOVGwfYqtS79WgJ2Q2BGJnZDYCQn-AeAAUwI</recordid><startdate>20100101</startdate><enddate>20100101</enddate><creator>Bergholz, André</creator><creator>De Beer, Jan</creator><creator>Glahn, Sebastian</creator><creator>Moens, Marie-Francine</creator><creator>Paaß, Gerhard</creator><creator>Strobel, Siehyun</creator><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20100101</creationdate><title>New filtering approaches for phishing email</title><author>Bergholz, André ; De Beer, Jan ; Glahn, Sebastian ; Moens, Marie-Francine ; Paaß, Gerhard ; Strobel, Siehyun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c312t-30e5cf905a5ada77e91ecfd360e3b0f0ad7c1ada033e32986a69d9715812d5443</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2010</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Bergholz, André</creatorcontrib><creatorcontrib>De Beer, Jan</creatorcontrib><creatorcontrib>Glahn, Sebastian</creatorcontrib><creatorcontrib>Moens, Marie-Francine</creatorcontrib><creatorcontrib>Paaß, Gerhard</creatorcontrib><creatorcontrib>Strobel, Siehyun</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems 
Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of computer security</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Bergholz, André</au><au>De Beer, Jan</au><au>Glahn, Sebastian</au><au>Moens, Marie-Francine</au><au>Paaß, Gerhard</au><au>Strobel, Siehyun</au><au>Skordas, Thomas</au><au>Lopez, Javier</au><au>Camenisch, Jan</au><au>Massacci, Fabio</au><au>Ciscato, Massimo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>New filtering approaches for phishing email</atitle><jtitle>Journal of computer security</jtitle><date>2010-01-01</date><risdate>2010</risdate><volume>18</volume><issue>1</issue><spage>7</spage><epage>35</epage><pages>7-35</pages><issn>0926-227X</issn><eissn>1875-8924</eissn><abstract>Phishing emails usually contain a message from a credible looking source requesting a user to click a link to a website where she/he is asked to enter a password or other confidential information. Most phishing emails aim at withdrawing money from financial institutions or getting access to private information. Phishing has increased enormously over the last years and is a serious threat to global security and economy. There are a number of possible countermeasures to phishing. These range from communication-oriented approaches like authentication protocols over blacklisting to content-based filtering approaches. We argue that the first two approaches are currently not broadly implemented or exhibit deficits. Therefore content-based phishing filters are necessary and widely used to increase communication security. 
A number of features are extracted capturing the content and structural properties of the email. Subsequently a statistical classifier is trained using these features on a training set of emails labeled as ham (legitimate), spam or phishing. This classifier may then be applied to an email stream to estimate the classes of new incoming emails. In this paper we describe a number of novel features that are particularly well-suited to identify phishing emails. These include statistical models for the low-dimensional descriptions of email topics, sequential analysis of email text and external links, the detection of embedded logos as well as indicators for hidden salting. Hidden salting is the intentional addition or distortion of content not perceivable by the reader. For empirical evaluation we have obtained a large realistic corpus of emails prelabeled as spam, phishing, and ham (legitimate). In experiments our methods outperform other published approaches for classifying phishing emails. We discuss the implications of these results for the practical application of this approach in the workflow of an email provider. Finally we describe a strategy how the filters may be updated and adapted to new types of phishing.</abstract><doi>10.3233/JCS-2010-0371</doi><tpages>29</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0926-227X
ispartof Journal of computer security, 2010-01, Vol.18 (1), p.7-35
issn 0926-227X
1875-8924
language eng
recordid cdi_proquest_miscellaneous_36337154
source Business Source Complete
title New filtering approaches for phishing email
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T20%3A24%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=New%20filtering%20approaches%20for%20phishing%20email&rft.jtitle=Journal%20of%20computer%20security&rft.au=Bergholz,%20Andr%C3%A9&rft.date=2010-01-01&rft.volume=18&rft.issue=1&rft.spage=7&rft.epage=35&rft.pages=7-35&rft.issn=0926-227X&rft.eissn=1875-8924&rft_id=info:doi/10.3233/JCS-2010-0371&rft_dat=%3Cproquest_cross%3E36337154%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=36337154&rft_id=info:pmid/&rfr_iscdi=true