Clues in Tweets: Twitter-Guided Discovery and Analysis of SMS Spam

With its critical role in business and service delivery through mobile devices, SMS (Short Message Service) has long been abused for spamming, which is still on the rise today possibly due to the emergence of A2P bulk messaging. The effort to control SMS spam has been hampered by the lack of up-to-d...

Bibliographic details
Published in:	arXiv.org 2022-04
Main authors:	Tang, Siyuan; Xianghang Mi; Li, Ying; Wang, XiaoFeng; Chen, Kai
Format:	Article
Language:	eng
Subjects:	Computer Science - Cryptography and Security; Datasets; Depth measurement; Electronic devices; Evaluation; Messages; Service introduction; Short message service; Spamming; Text messaging
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Tang, Siyuan; Xianghang Mi; Li, Ying; Wang, XiaoFeng; Chen, Kai
description	With its critical role in business and service delivery through mobile devices, SMS (Short Message Service) has long been abused for spamming, which is still on the rise today possibly due to the emergence of A2P bulk messaging. The effort to control SMS spam has been hampered by the lack of up-to-date information about illicit activities. In our research, we proposed a novel solution to collect recent SMS spam data, at a large scale, from Twitter, where users voluntarily report the spam messages they receive. For this purpose, we designed and implemented SpamHunter, an automated pipeline to discover SMS spam reporting tweets and extract message content from the attached screenshots. Leveraging SpamHunter, we collected from Twitter a dataset of 21,918 SMS spam messages in 75 languages, spanning over four years. To our best knowledge, this is the largest SMS spam dataset ever made public. More importantly, SpamHunter enables us to continuously monitor emerging SMS spam messages, which facilitates the ongoing effort to mitigate SMS spamming. We also performed an in-depth measurement study that sheds light on the new trends in the spammer's strategies, infrastructure and spam campaigns. We also utilized our spam SMS data to evaluate the robustness of the spam countermeasures put in place by the SMS ecosystem, including anti-spam services, bulk SMS services, and text messaging apps. Our evaluation shows that such protection cannot effectively handle those spam samples: either introducing significant false positives or missing a large number of newly reported spam messages.
doi_str_mv	10.48550/arxiv.2204.01233
format	Article
fullrecord	<record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_2204_01233</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2647056976</sourcerecordid><originalsourceid>FETCH-LOGICAL-a953-2decd504611feec01f51cb9fd376feb90c98ce749a5128972dae376e9f8927d73</originalsourceid><addsrcrecordid>eNotjz1PwzAYhC0kJKrSH8CEJeYEf8RxzFYCLUhFDMkeufFryVWaBDsp5N8TWpa74U6nexC6oyROMiHIo_Y_7hQzRpKYUMb5FVrMSqMsYewGrUI4EEJYKpkQfIGe82aEgF2Ly2-AITzN7oYBfLQdnQGDX1youxP4CevW4HWrmym4gDuLi48CF70-3qJrq5sAq39fonLzWuZv0e5z-56vd5FWgkfMQG0ESVJKLUBNqBW03itruEwt7BWpVVaDTJQWlGVKMqNhjkDZTDFpJF-i-8vsGbDqvTtqP1V_oNUZdG48XBq9775mqqE6dKOfH4eKpYkkIlUy5b8ORFTC</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2647056976</pqid></control><display><type>article</type><title>Clues in Tweets: Twitter-Guided Discovery and Analysis of SMS Spam</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Tang, Siyuan ; Xianghang Mi ; Li, Ying ; Wang, XiaoFeng ; Chen, Kai</creator><creatorcontrib>Tang, Siyuan ; Xianghang Mi ; Li, Ying ; Wang, XiaoFeng ; Chen, Kai</creatorcontrib><description>With its critical role in business and service delivery through mobile devices, SMS (Short Message Service) has long been abused for spamming, which is still on the rise today possibly due to the emergence of A2P bulk messaging. The effort to control SMS spam has been hampered by the lack of up-to-date information about illicit activities. In our research, we proposed a novel solution to collect recent SMS spam data, at a large scale, from Twitter, where users voluntarily report the spam messages they receive. For this purpose, we designed and implemented SpamHunter, an automated pipeline to discover SMS spam reporting tweets and extract message content from the attached screenshots. 
Leveraging SpamHunter, we collected from Twitter a dataset of 21,918 SMS spam messages in 75 languages, spanning over four years. To our best knowledge, this is the largest SMS spam dataset ever made public. More importantly, SpamHunter enables us to continuously monitor emerging SMS spam messages, which facilitates the ongoing effort to mitigate SMS spamming. We also performed an in-depth measurement study that sheds light on the new trends in the spammer's strategies, infrastructure and spam campaigns. We also utilized our spam SMS data to evaluate the robustness of the spam countermeasures put in place by the SMS ecosystem, including anti-spam services, bulk SMS services, and text messaging apps. Our evaluation shows that such protection cannot effectively handle those spam samples: either introducing significant false positives or missing a large number of newly reported spam messages.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2204.01233</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Computer Science - Cryptography and Security ; Datasets ; Depth measurement ; Electronic devices ; Evaluation ; Messages ; Service introduction ; Short message service ; Spamming ; Text messaging</subject><ispartof>arXiv.org, 2022-04</ispartof><rights>2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,784,885,27925</link.rule.ids><backlink>$$Uhttps://doi.org/10.48550/arXiv.2204.01233$$DView paper in arXiv$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.1145/3548606.3559351$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink></links><search><creatorcontrib>Tang, Siyuan</creatorcontrib><creatorcontrib>Xianghang Mi</creatorcontrib><creatorcontrib>Li, Ying</creatorcontrib><creatorcontrib>Wang, XiaoFeng</creatorcontrib><creatorcontrib>Chen, Kai</creatorcontrib><title>Clues in Tweets: Twitter-Guided Discovery and Analysis of SMS Spam</title><title>arXiv.org</title><description>With its critical role in business and service delivery through mobile devices, SMS (Short Message Service) has long been abused for spamming, which is still on the rise today possibly due to the emergence of A2P bulk messaging. The effort to control SMS spam has been hampered by the lack of up-to-date information about illicit activities. In our research, we proposed a novel solution to collect recent SMS spam data, at a large scale, from Twitter, where users voluntarily report the spam messages they receive. For this purpose, we designed and implemented SpamHunter, an automated pipeline to discover SMS spam reporting tweets and extract message content from the attached screenshots. Leveraging SpamHunter, we collected from Twitter a dataset of 21,918 SMS spam messages in 75 languages, spanning over four years. To our best knowledge, this is the largest SMS spam dataset ever made public. 
More importantly, SpamHunter enables us to continuously monitor emerging SMS spam messages, which facilitates the ongoing effort to mitigate SMS spamming. We also performed an in-depth measurement study that sheds light on the new trends in the spammer's strategies, infrastructure and spam campaigns. We also utilized our spam SMS data to evaluate the robustness of the spam countermeasures put in place by the SMS ecosystem, including anti-spam services, bulk SMS services, and text messaging apps. Our evaluation shows that such protection cannot effectively handle those spam samples: either introducing significant false positives or missing a large number of newly reported spam messages.</description><subject>Computer Science - Cryptography and Security</subject><subject>Datasets</subject><subject>Depth measurement</subject><subject>Electronic devices</subject><subject>Evaluation</subject><subject>Messages</subject><subject>Service introduction</subject><subject>Short message service</subject><subject>Spamming</subject><subject>Text messaging</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GOX</sourceid><recordid>eNotjz1PwzAYhC0kJKrSH8CEJeYEf8RxzFYCLUhFDMkeufFryVWaBDsp5N8TWpa74U6nexC6oyROMiHIo_Y_7hQzRpKYUMb5FVrMSqMsYewGrUI4EEJYKpkQfIGe82aEgF2Ly2-AITzN7oYBfLQdnQGDX1youxP4CevW4HWrmym4gDuLi48CF70-3qJrq5sAq39fonLzWuZv0e5z-56vd5FWgkfMQG0ESVJKLUBNqBW03itruEwt7BWpVVaDTJQWlGVKMqNhjkDZTDFpJF-i-8vsGbDqvTtqP1V_oNUZdG48XBq9775mqqE6dKOfH4eKpYkkIlUy5b8ORFTC</recordid><startdate>20220404</startdate><enddate>20220404</enddate><creator>Tang, Siyuan</creator><creator>Xianghang Mi</creator><creator>Li, Ying</creator><creator>Wang, XiaoFeng</creator><creator>Chen, Kai</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20220404</creationdate><title>Clues in Tweets: Twitter-Guided Discovery and Analysis of SMS Spam</title><author>Tang, Siyuan ; Xianghang Mi ; Li, Ying ; Wang, XiaoFeng ; Chen, Kai</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a953-2decd504611feec01f51cb9fd376feb90c98ce749a5128972dae376e9f8927d73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Computer Science - Cryptography and Security</topic><topic>Datasets</topic><topic>Depth measurement</topic><topic>Electronic devices</topic><topic>Evaluation</topic><topic>Messages</topic><topic>Service introduction</topic><topic>Short message service</topic><topic>Spamming</topic><topic>Text messaging</topic><toplevel>online_resources</toplevel><creatorcontrib>Tang, Siyuan</creatorcontrib><creatorcontrib>Xianghang Mi</creatorcontrib><creatorcontrib>Li, Ying</creatorcontrib><creatorcontrib>Wang, XiaoFeng</creatorcontrib><creatorcontrib>Chen, Kai</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central 
Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>arXiv Computer Science</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tang, Siyuan</au><au>Xianghang Mi</au><au>Li, Ying</au><au>Wang, XiaoFeng</au><au>Chen, Kai</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Clues in Tweets: Twitter-Guided Discovery and Analysis of SMS Spam</atitle><jtitle>arXiv.org</jtitle><date>2022-04-04</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>With its critical role in business and service delivery through mobile devices, SMS (Short Message Service) has long been abused for spamming, which is still on the rise today possibly due to the emergence of A2P bulk messaging. The effort to control SMS spam has been hampered by the lack of up-to-date information about illicit activities. In our research, we proposed a novel solution to collect recent SMS spam data, at a large scale, from Twitter, where users voluntarily report the spam messages they receive. For this purpose, we designed and implemented SpamHunter, an automated pipeline to discover SMS spam reporting tweets and extract message content from the attached screenshots. Leveraging SpamHunter, we collected from Twitter a dataset of 21,918 SMS spam messages in 75 languages, spanning over four years. To our best knowledge, this is the largest SMS spam dataset ever made public. 
More importantly, SpamHunter enables us to continuously monitor emerging SMS spam messages, which facilitates the ongoing effort to mitigate SMS spamming. We also performed an in-depth measurement study that sheds light on the new trends in the spammer's strategies, infrastructure and spam campaigns. We also utilized our spam SMS data to evaluate the robustness of the spam countermeasures put in place by the SMS ecosystem, including anti-spam services, bulk SMS services, and text messaging apps. Our evaluation shows that such protection cannot effectively handle those spam samples: either introducing significant false positives or missing a large number of newly reported spam messages.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2204.01233</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2022-04
issn	2331-8422
language	eng
recordid	cdi_arxiv_primary_2204_01233
source	arXiv.org; Free E- Journals
subjects	Computer Science - Cryptography and Security; Datasets; Depth measurement; Electronic devices; Evaluation; Messages; Service introduction; Short message service; Spamming; Text messaging
title	Clues in Tweets: Twitter-Guided Discovery and Analysis of SMS Spam
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T00%3A30%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Clues%20in%20Tweets:%20Twitter-Guided%20Discovery%20and%20Analysis%20of%20SMS%20Spam&rft.jtitle=arXiv.org&rft.au=Tang,%20Siyuan&rft.date=2022-04-04&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2204.01233&rft_dat=%3Cproquest_arxiv%3E2647056976%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2647056976&rft_id=info:pmid/&rfr_iscdi=true