An improved ELM-based and data preprocessing integrated approach for phishing detection considering comprehensive features

• Define three types of features extracted from URLs, domains, etc.
• Exploit a method to balance the majority and minority class samples.
• Adopt an improved DAE-based method to reduce the dimension of the dataset.
• Boost the detection performance by using the improved ELM-based classifier.
• Do experiment...

Detailed description

Bibliographic details
Published in:	Expert systems with applications 2021-03, Vol.165, p.113863, Article 113863
Main authors:	Yang, Liqun; Zhang, Jiawei; Wang, Xiaozhe; Li, Zhi; Li, Zhoujun; He, Yueying
Format:	Article
Language:	eng
Subjects:	Adaptive algorithms; Adaptive sampling; ADASYN; Artificial neural networks; Coders; Dimension reduction; Extreme learning machine (ELM); Machine learning; Noise reduction; Non-inverse matrix; Phishing; Phishing detection; SDAE; Training; Websites
Online access:	Full text

container_end_page
container_issue
container_start_page	113863
container_title	Expert systems with applications
container_volume	165
creator	Yang, Liqun; Zhang, Jiawei; Wang, Xiaozhe; Li, Zhi; Li, Zhoujun; He, Yueying
description	• Define three types of features extracted from URLs, domains, etc. • Exploit a method to balance the majority and minority class samples. • Adopt an improved DAE-based method to reduce the dimension of the dataset. • Boost the detection performance by using the improved ELM-based classifier. • Do experiments to verify the feasibility and effectiveness of the proposed approach. In this paper, a novel approach based on non-inverse matrix online sequence extreme learning machine (NIOSELM) for phishing detection is presented, which takes into account three types of features to comprehensively characterize a website. For the NIOSELM algorithm, we use the Sherman-Morrison-Woodbury equation to avoid the matrix inversion operation, and introduce the idea of the online sequence extreme learning machine (OSELM) to update the training model. To reduce the dependence of the detection model on the majority class, we use the Adaptive Synthetic Sampling (ADASYN) algorithm to generate synthetic minority class samples that balance the distribution between the majority and minority classes. Furthermore, an improved denoising auto-encoder (SDAE) is designed to reduce the dimension of the experimental dataset. The experimental results show the efficiency and feasibility of the proposed detection mechanism. Moreover, the overall detection performance of NIOSELM is better than that of other existing methods, especially in training speed and detection accuracy.
doi_str_mv	10.1016/j.eswa.2020.113863
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2487170018</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0957417420306734</els_id><sourcerecordid>2487170018</sourcerecordid><originalsourceid>FETCH-LOGICAL-c328t-e3f532913d198bb646cdd1d2cdcba6104cbb12d08a6a4ec85d06146587e26b73</originalsourceid><addsrcrecordid>eNp9kMtOwzAQRS0EEuXxA6wssU7xI7EdiU1VlYdUxKZ7y7EnjSOaBDstgq_HUVizmtGde2dGB6E7SpaUUPHQLiF-mSUjLAmUK8HP0IIqyTMhS36OFqQsZJZTmV-iqxhbQqgkRC7Qz6rD_jCE_gQOb7ZvWWVi6kznsDOjwUOANLQQo-_22Hcj7IMZJ8eQdGMbXPcBD42PzWRwMIIdfd9h23fROwiTavt0ARpIyglwDWY8Bog36KI2HxFu_-o12j1tduuXbPv-_LpebTPLmRoz4HXBWUm5o6WqKpEL6xx1zDpbGUFJbquKMkeUESYHqwpHBM1FoSQwUUl-je7ntenfzyPEUbf9MXTpoma5khMHqpKLzS4b-hgD1HoI_mDCt6ZET4h1qyfEekKsZ8Qp9DiHIL1_8hB0tB46C86HhEG73v8X_wWL-odc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2487170018</pqid></control><display><type>article</type><title>An improved ELM-based and data preprocessing integrated approach for phishing detection considering comprehensive features</title><source>Elsevier ScienceDirect Journals Complete - AutoHoldings</source><creator>Yang, Liqun ; Zhang, Jiawei ; Wang, Xiaozhe ; Li, Zhi ; Li, Zhoujun ; He, Yueying</creator><creatorcontrib>Yang, Liqun ; Zhang, Jiawei ; Wang, Xiaozhe ; Li, Zhi ; Li, Zhoujun ; He, Yueying</creatorcontrib><description>•Define three types of features extracted from URLs, domains, etc.•Exploit a method to balance the majority and minority class samples.•Adopt an improved DAE-based method to reduce the dimension of the dataset.•Boost the detection performance by using the improved ELM-based classifier.•Do experiments to verify the feasibility and effectiveness of the proposed approach. 
In this paper, a novel approach based on non-inverse matrix online sequence extreme learning machine (NIOSELM) for phishing detection is presented, which takes into account three types of features to comprehensively characterize a website. For the NIOSELM algorithm, we use Sherman Morriso Woodbury equation to avoid the matrix inversion operation, and introduce the idea of online sequence extreme learning machine (OSELM) to update the training model. In order to reduce the dependence of the detection model on the majority class, we use Adaptive Synthetic Sampling (ADASYN) algorithm to generate the synthetic minority class samples to balance the distribution between the samples of the majority and minority classes. Furthermore, an improved denoising auto-encoder (SDAE) is designed to reduce the dimension of the experimental dataset. The experimental results show the efficiency and feasibility of the proposed detection mechanism. Moreover, the overall detection performance of NIOSELM is better than that of other existing methods, especially in training speed and the detection accuracy.</description><identifier>ISSN: 0957-4174</identifier><identifier>EISSN: 1873-6793</identifier><identifier>DOI: 10.1016/j.eswa.2020.113863</identifier><language>eng</language><publisher>New York: Elsevier Ltd</publisher><subject>Adaptive algorithms ; Adaptive sampling ; ADASYN ; Artificial neural networks ; Coders ; Dimension reduction ; Extreme learning machine (ELM) ; Machine learning ; Noise reduction ; Non-inverse matrix ; Phishing ; Phishing detection ; SDAE ; Training ; Websites</subject><ispartof>Expert systems with applications, 2021-03, Vol.165, p.113863, Article 113863</ispartof><rights>2020 Elsevier Ltd</rights><rights>Copyright Elsevier BV Mar 1, 
2021</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c328t-e3f532913d198bb646cdd1d2cdcba6104cbb12d08a6a4ec85d06146587e26b73</citedby><cites>FETCH-LOGICAL-c328t-e3f532913d198bb646cdd1d2cdcba6104cbb12d08a6a4ec85d06146587e26b73</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.eswa.2020.113863$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,778,782,3539,27907,27908,45978</link.rule.ids></links><search><creatorcontrib>Yang, Liqun</creatorcontrib><creatorcontrib>Zhang, Jiawei</creatorcontrib><creatorcontrib>Wang, Xiaozhe</creatorcontrib><creatorcontrib>Li, Zhi</creatorcontrib><creatorcontrib>Li, Zhoujun</creatorcontrib><creatorcontrib>He, Yueying</creatorcontrib><title>An improved ELM-based and data preprocessing integrated approach for phishing detection considering comprehensive features</title><title>Expert systems with applications</title><description>•Define three types of features extracted from URLs, domains, etc.•Exploit a method to balance the majority and minority class samples.•Adopt an improved DAE-based method to reduce the dimension of the dataset.•Boost the detection performance by using the improved ELM-based classifier.•Do experiments to verify the feasibility and effectiveness of the proposed approach. In this paper, a novel approach based on non-inverse matrix online sequence extreme learning machine (NIOSELM) for phishing detection is presented, which takes into account three types of features to comprehensively characterize a website. For the NIOSELM algorithm, we use Sherman Morriso Woodbury equation to avoid the matrix inversion operation, and introduce the idea of online sequence extreme learning machine (OSELM) to update the training model. 
In order to reduce the dependence of the detection model on the majority class, we use Adaptive Synthetic Sampling (ADASYN) algorithm to generate the synthetic minority class samples to balance the distribution between the samples of the majority and minority classes. Furthermore, an improved denoising auto-encoder (SDAE) is designed to reduce the dimension of the experimental dataset. The experimental results show the efficiency and feasibility of the proposed detection mechanism. Moreover, the overall detection performance of NIOSELM is better than that of other existing methods, especially in training speed and the detection accuracy.</description><subject>Adaptive algorithms</subject><subject>Adaptive sampling</subject><subject>ADASYN</subject><subject>Artificial neural networks</subject><subject>Coders</subject><subject>Dimension reduction</subject><subject>Extreme learning machine (ELM)</subject><subject>Machine learning</subject><subject>Noise reduction</subject><subject>Non-inverse matrix</subject><subject>Phishing</subject><subject>Phishing detection</subject><subject>SDAE</subject><subject>Training</subject><subject>Websites</subject><issn>0957-4174</issn><issn>1873-6793</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kMtOwzAQRS0EEuXxA6wssU7xI7EdiU1VlYdUxKZ7y7EnjSOaBDstgq_HUVizmtGde2dGB6E7SpaUUPHQLiF-mSUjLAmUK8HP0IIqyTMhS36OFqQsZJZTmV-iqxhbQqgkRC7Qz6rD_jCE_gQOb7ZvWWVi6kznsDOjwUOANLQQo-_22Hcj7IMZJ8eQdGMbXPcBD42PzWRwMIIdfd9h23fROwiTavt0ARpIyglwDWY8Bog36KI2HxFu_-o12j1tduuXbPv-_LpebTPLmRoz4HXBWUm5o6WqKpEL6xx1zDpbGUFJbquKMkeUESYHqwpHBM1FoSQwUUl-je7ntenfzyPEUbf9MXTpoma5khMHqpKLzS4b-hgD1HoI_mDCt6ZET4h1qyfEekKsZ8Qp9DiHIL1_8hB0tB46C86HhEG73v8X_wWL-odc</recordid><startdate>20210301</startdate><enddate>20210301</enddate><creator>Yang, Liqun</creator><creator>Zhang, Jiawei</creator><creator>Wang, Xiaozhe</creator><creator>Li, Zhi</creator><creator>Li, 
Zhoujun</creator><creator>He, Yueying</creator><general>Elsevier Ltd</general><general>Elsevier BV</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20210301</creationdate><title>An improved ELM-based and data preprocessing integrated approach for phishing detection considering comprehensive features</title><author>Yang, Liqun ; Zhang, Jiawei ; Wang, Xiaozhe ; Li, Zhi ; Li, Zhoujun ; He, Yueying</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c328t-e3f532913d198bb646cdd1d2cdcba6104cbb12d08a6a4ec85d06146587e26b73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Adaptive algorithms</topic><topic>Adaptive sampling</topic><topic>ADASYN</topic><topic>Artificial neural networks</topic><topic>Coders</topic><topic>Dimension reduction</topic><topic>Extreme learning machine (ELM)</topic><topic>Machine learning</topic><topic>Noise reduction</topic><topic>Non-inverse matrix</topic><topic>Phishing</topic><topic>Phishing detection</topic><topic>SDAE</topic><topic>Training</topic><topic>Websites</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yang, Liqun</creatorcontrib><creatorcontrib>Zhang, Jiawei</creatorcontrib><creatorcontrib>Wang, Xiaozhe</creatorcontrib><creatorcontrib>Li, Zhi</creatorcontrib><creatorcontrib>Li, Zhoujun</creatorcontrib><creatorcontrib>He, Yueying</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems 
Abstracts Professional</collection><jtitle>Expert systems with applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yang, Liqun</au><au>Zhang, Jiawei</au><au>Wang, Xiaozhe</au><au>Li, Zhi</au><au>Li, Zhoujun</au><au>He, Yueying</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An improved ELM-based and data preprocessing integrated approach for phishing detection considering comprehensive features</atitle><jtitle>Expert systems with applications</jtitle><date>2021-03-01</date><risdate>2021</risdate><volume>165</volume><spage>113863</spage><pages>113863-</pages><artnum>113863</artnum><issn>0957-4174</issn><eissn>1873-6793</eissn><abstract>•Define three types of features extracted from URLs, domains, etc.•Exploit a method to balance the majority and minority class samples.•Adopt an improved DAE-based method to reduce the dimension of the dataset.•Boost the detection performance by using the improved ELM-based classifier.•Do experiments to verify the feasibility and effectiveness of the proposed approach. In this paper, a novel approach based on non-inverse matrix online sequence extreme learning machine (NIOSELM) for phishing detection is presented, which takes into account three types of features to comprehensively characterize a website. For the NIOSELM algorithm, we use Sherman Morriso Woodbury equation to avoid the matrix inversion operation, and introduce the idea of online sequence extreme learning machine (OSELM) to update the training model. In order to reduce the dependence of the detection model on the majority class, we use Adaptive Synthetic Sampling (ADASYN) algorithm to generate the synthetic minority class samples to balance the distribution between the samples of the majority and minority classes. Furthermore, an improved denoising auto-encoder (SDAE) is designed to reduce the dimension of the experimental dataset. 
The experimental results show the efficiency and feasibility of the proposed detection mechanism. Moreover, the overall detection performance of NIOSELM is better than that of other existing methods, especially in training speed and the detection accuracy.</abstract><cop>New York</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.eswa.2020.113863</doi></addata></record>
fulltext	fulltext
identifier	ISSN: 0957-4174
ispartof	Expert systems with applications, 2021-03, Vol.165, p.113863, Article 113863
issn	0957-4174 1873-6793
language	eng
recordid	cdi_proquest_journals_2487170018
source	Elsevier ScienceDirect Journals Complete - AutoHoldings
subjects	Adaptive algorithms; Adaptive sampling; ADASYN; Artificial neural networks; Coders; Dimension reduction; Extreme learning machine (ELM); Machine learning; Noise reduction; Non-inverse matrix; Phishing; Phishing detection; SDAE; Training; Websites
title	An improved ELM-based and data preprocessing integrated approach for phishing detection considering comprehensive features
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T08%3A25%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20improved%20ELM-based%20and%20data%20preprocessing%20integrated%20approach%20for%20phishing%20detection%20considering%20comprehensive%20features&rft.jtitle=Expert%20systems%20with%20applications&rft.au=Yang,%20Liqun&rft.date=2021-03-01&rft.volume=165&rft.spage=113863&rft.pages=113863-&rft.artnum=113863&rft.issn=0957-4174&rft.eissn=1873-6793&rft_id=info:doi/10.1016/j.eswa.2020.113863&rft_dat=%3Cproquest_cross%3E2487170018%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2487170018&rft_id=info:pmid/&rft_els_id=S0957417420306734&rfr_iscdi=true
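
The description field in this record says that NIOSELM uses the Sherman-Morrison-Woodbury equation to avoid matrix inversion and updates the model sequentially in the style of OSELM. The record gives no further algorithmic detail, so the following is only a minimal sketch of the generic technique, not the authors' implementation: a rank-one Sherman-Morrison update of an inverse matrix, used inside a recursive least-squares loop for a single output. The function names `sherman_morrison_update` and `sequential_fit`, the `ridge` parameter, and the toy data are all assumptions made for illustration.

```python
def sherman_morrison_update(P, h):
    """Given P = A^-1 (A symmetric), return (A + h h^T)^-1 without inverting a matrix.

    Hypothetical helper: P is a list of rows, h is a list of floats.
    """
    n = len(h)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]  # P h
    hP = [sum(h[i] * P[i][j] for i in range(n)) for j in range(n)]  # h^T P
    denom = 1.0 + sum(h[i] * Ph[i] for i in range(n))               # 1 + h^T P h
    return [[P[i][j] - Ph[i] * hP[j] / denom for j in range(n)]
            for i in range(n)]


def sequential_fit(samples, n_features, ridge=1e-6):
    """Fit output weights one sample at a time (recursive least squares).

    Assumption: each sample is (h, t), where h plays the role of one row of
    hidden-layer outputs and t is a scalar target. P starts as (ridge * I)^-1.
    """
    P = [[(1.0 / ridge if i == j else 0.0) for j in range(n_features)]
         for i in range(n_features)]
    beta = [0.0] * n_features
    for h, t in samples:
        # Fold the new sample into the inverse with a rank-one update.
        P = sherman_morrison_update(P, h)
        # Prediction error of the current weights on the new sample.
        error = t - sum(h[i] * beta[i] for i in range(n_features))
        # Correct the weights along P h, scaled by the error.
        gain = [sum(P[i][j] * h[j] for j in range(n_features))
                for i in range(n_features)]
        beta = [beta[i] + gain[i] * error for i in range(n_features)]
    return beta


if __name__ == "__main__":
    # Toy data generated from t = 2*h0 - 1*h1 (illustrative only).
    data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
            ([1.0, 1.0], 1.0), ([2.0, 1.0], 3.0)]
    print(sequential_fit(data, n_features=2))
```

Each new sample costs only matrix-vector products, which is the general reason such updates are attractive for sequential training. Whether NIOSELM processes samples one at a time or in chunks is not stated in this record.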