Spammer Behavior Analysis and Detection in User Generated Content on Social Networks

Spam content is surging with an explosive increase of user generated content (UGC) on the Internet. Spammers often insert popular keywords or simply copy and paste recent articles from the Web with spam links inserted, attempting to disable content-based detection. In order to effectively detect spa...

Bibliographic details
Main authors:	Enhua Tan, Lei Guo, Songqing Chen, Xiaodong Zhang, Yihong Zhao
Format:	Conference proceeding
Language:	eng
Subjects:	Bars; Blogs; Feature extraction; Runtime; Software; Unsolicited electronic mail
Online access:	Order full text

container_end_page	314
container_issue
container_start_page	305
container_title
container_volume
creator	Enhua Tan ; Lei Guo ; Songqing Chen ; Xiaodong Zhang ; Yihong Zhao
description	Spam content is surging with an explosive increase of user generated content (UGC) on the Internet. Spammers often insert popular keywords or simply copy and paste recent articles from the Web with spam links inserted, attempting to disable content-based detection. In order to effectively detect spam in user generated content, we first conduct a comprehensive analysis of spamming activities on a large commercial UGC site in 325 days covering over 6 million posts and nearly 400 thousand users. Our analysis shows that UGC spammers exhibit unique non-textual patterns, such as posting activities, advertised spam link metrics, and spam hosting behaviors. Based on these non-textual features, we show via several classification methods that a high detection rate could be achieved offline. These results further motivate us to develop a runtime scheme, BARS, to detect spam posts based on these spamming patterns. The experimental results demonstrate the effectiveness and robustness of BARS.
doi_str_mv	10.1109/ICDCS.2012.40
format	Conference Proceeding
fullrecord	<record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6258003</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6258003</ieee_id><sourcerecordid>6258003</sourcerecordid><originalsourceid>FETCH-LOGICAL-i175t-dc0a57453c11ad52fbbafd3ac042e834e2342b23fe0bf104eadfb80979f186a83</originalsourceid><addsrcrecordid>eNotjDtPwzAURs1LopSOTCz-Ayn3-hHbY0mhVKpgaDtXTnItDG1SxRaIf08l-JYznKOPsTuEKSK4h2U1r9ZTASimCs7YxBkLpnRalVbbczYS2ujCKsQLdoNKGwPCaXfJRgilLEonzDWbpPQBpxmLKOyIbdZHfzjQwB_p3X_FfuCzzu9_Ukzcdy2fU6Ymx77jsePbdOoW1NHgM7W86rtMXeYnue6b6Pf8lfJ3P3ymW3YV_D7R5J9jtn1-2lQvxeptsaxmqyKi0bloG_DaKC0bRN9qEerah1b6BpQgKxUJqUQtZCCoA4Ii34bagjMuoC29lWN2__cbiWh3HOLBDz-7UmgLIOUvZYtU5Q</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Spammer Behavior Analysis and Detection in User Generated Content on Social Networks</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Enhua Tan ; Lei Guo ; Songqing Chen ; Xiaodong Zhang ; Yihong Zhao</creator><creatorcontrib>Enhua Tan ; Lei Guo ; Songqing Chen ; Xiaodong Zhang ; Yihong Zhao</creatorcontrib><description>Spam content is surging with an explosive increase of user generated content (UGC) on the Internet. Spammers often insert popular keywords or simply copy and paste recent articles from the Web with spam links inserted, attempting to disable content-based detection. In order to effectively detect spam in user generated content, we first conduct a comprehensive analysis of spamming activities on a large commercial UGC site in 325 days covering over 6 million posts and nearly 400 thousand users. Our analysis shows that UGC spammers exhibit unique non-textual patterns, such as posting activities, advertised spam link metrics, and spam hosting behaviors. 
Based on these non-textual features, we show via several classification methods that a high detection rate could be achieved offline. These results further motivate us to develop a runtime scheme, BARS, to detect spam posts based on these spamming patterns. The experimental results demonstrate the effectiveness and robustness of BARS.</description><identifier>ISSN: 1063-6927</identifier><identifier>ISBN: 1457702959</identifier><identifier>ISBN: 9781457702952</identifier><identifier>EISSN: 2575-8411</identifier><identifier>EISBN: 9780769546858</identifier><identifier>EISBN: 0769546854</identifier><identifier>DOI: 10.1109/ICDCS.2012.40</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Bars ; Blogs ; Feature extraction ; Runtime ; Software ; Unsolicited electronic mail</subject><ispartof>2012 IEEE 32nd International Conference on Distributed Computing Systems, 2012, p.305-314</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6258003$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2057,27924,54919</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6258003$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Enhua Tan</creatorcontrib><creatorcontrib>Lei Guo</creatorcontrib><creatorcontrib>Songqing Chen</creatorcontrib><creatorcontrib>Xiaodong Zhang</creatorcontrib><creatorcontrib>Yihong Zhao</creatorcontrib><title>Spammer Behavior Analysis and Detection in User Generated Content on Social Networks</title><title>2012 IEEE 32nd International Conference on Distributed Computing Systems</title><addtitle>ICDSC</addtitle><description>Spam content is surging with an explosive increase of user generated content 
(UGC) on the Internet. Spammers often insert popular keywords or simply copy and paste recent articles from the Web with spam links inserted, attempting to disable content-based detection. In order to effectively detect spam in user generated content, we first conduct a comprehensive analysis of spamming activities on a large commercial UGC site in 325 days covering over 6 million posts and nearly 400 thousand users. Our analysis shows that UGC spammers exhibit unique non-textual patterns, such as posting activities, advertised spam link metrics, and spam hosting behaviors. Based on these non-textual features, we show via several classification methods that a high detection rate could be achieved offline. These results further motivate us to develop a runtime scheme, BARS, to detect spam posts based on these spamming patterns. The experimental results demonstrate the effectiveness and robustness of BARS.</description><subject>Bars</subject><subject>Blogs</subject><subject>Feature extraction</subject><subject>Runtime</subject><subject>Software</subject><subject>Unsolicited electronic mail</subject><issn>1063-6927</issn><issn>2575-8411</issn><isbn>1457702959</isbn><isbn>9781457702952</isbn><isbn>9780769546858</isbn><isbn>0769546854</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2012</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNotjDtPwzAURs1LopSOTCz-Ayn3-hHbY0mhVKpgaDtXTnItDG1SxRaIf08l-JYznKOPsTuEKSK4h2U1r9ZTASimCs7YxBkLpnRalVbbczYS2ujCKsQLdoNKGwPCaXfJRgilLEonzDWbpPQBpxmLKOyIbdZHfzjQwB_p3X_FfuCzzu9_Ukzcdy2fU6Ymx77jsePbdOoW1NHgM7W86rtMXeYnue6b6Pf8lfJ3P3ymW3YV_D7R5J9jtn1-2lQvxeptsaxmqyKi0bloG_DaKC0bRN9qEerah1b6BpQgKxUJqUQtZCCoA4Ii34bagjMuoC29lWN2__cbiWh3HOLBDz-7UmgLIOUvZYtU5Q</recordid><startdate>201206</startdate><enddate>201206</enddate><creator>Enhua Tan</creator><creator>Lei Guo</creator><creator>Songqing Chen</creator><creator>Xiaodong 
Zhang</creator><creator>Yihong Zhao</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>201206</creationdate><title>Spammer Behavior Analysis and Detection in User Generated Content on Social Networks</title><author>Enhua Tan ; Lei Guo ; Songqing Chen ; Xiaodong Zhang ; Yihong Zhao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i175t-dc0a57453c11ad52fbbafd3ac042e834e2342b23fe0bf104eadfb80979f186a83</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Bars</topic><topic>Blogs</topic><topic>Feature extraction</topic><topic>Runtime</topic><topic>Software</topic><topic>Unsolicited electronic mail</topic><toplevel>online_resources</toplevel><creatorcontrib>Enhua Tan</creatorcontrib><creatorcontrib>Lei Guo</creatorcontrib><creatorcontrib>Songqing Chen</creatorcontrib><creatorcontrib>Xiaodong Zhang</creatorcontrib><creatorcontrib>Yihong Zhao</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Enhua Tan</au><au>Lei Guo</au><au>Songqing Chen</au><au>Xiaodong Zhang</au><au>Yihong Zhao</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Spammer Behavior Analysis and Detection in User Generated Content on Social Networks</atitle><btitle>2012 IEEE 32nd International Conference on Distributed Computing 
Systems</btitle><stitle>ICDSC</stitle><date>2012-06</date><risdate>2012</risdate><spage>305</spage><epage>314</epage><pages>305-314</pages><issn>1063-6927</issn><eissn>2575-8411</eissn><isbn>1457702959</isbn><isbn>9781457702952</isbn><eisbn>9780769546858</eisbn><eisbn>0769546854</eisbn><coden>IEEPAD</coden><abstract>Spam content is surging with an explosive increase of user generated content (UGC) on the Internet. Spammers often insert popular keywords or simply copy and paste recent articles from the Web with spam links inserted, attempting to disable content-based detection. In order to effectively detect spam in user generated content, we first conduct a comprehensive analysis of spamming activities on a large commercial UGC site in 325 days covering over 6 million posts and nearly 400 thousand users. Our analysis shows that UGC spammers exhibit unique non-textual patterns, such as posting activities, advertised spam link metrics, and spam hosting behaviors. Based on these non-textual features, we show via several classification methods that a high detection rate could be achieved offline. These results further motivate us to develop a runtime scheme, BARS, to detect spam posts based on these spamming patterns. The experimental results demonstrate the effectiveness and robustness of BARS.</abstract><pub>IEEE</pub><doi>10.1109/ICDCS.2012.40</doi><tpages>10</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1063-6927
ispartof	2012 IEEE 32nd International Conference on Distributed Computing Systems, 2012, p.305-314
issn	1063-6927 2575-8411
language	eng
recordid	cdi_ieee_primary_6258003
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Bars ; Blogs ; Feature extraction ; Runtime ; Software ; Unsolicited electronic mail
title	Spammer Behavior Analysis and Detection in User Generated Content on Social Networks
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T21%3A10%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Spammer%20Behavior%20Analysis%20and%20Detection%20in%20User%20Generated%20Content%20on%20Social%20Networks&rft.btitle=2012%20IEEE%2032nd%20International%20Conference%20on%20Distributed%20Computing%20Systems&rft.au=Enhua%20Tan&rft.date=2012-06&rft.spage=305&rft.epage=314&rft.pages=305-314&rft.issn=1063-6927&rft.eissn=2575-8411&rft.isbn=1457702959&rft.isbn_list=9781457702952&rft.coden=IEEPAD&rft_id=info:doi/10.1109/ICDCS.2012.40&rft_dat=%3Cieee_6IE%3E6258003%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9780769546858&rft.eisbn_list=0769546854&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6258003&rfr_iscdi=true