Rumour detection using graph neural network and oversampling in benchmark Twitter dataset

Recently, online social media has become a primary source for new information and misinformation or rumours. In the absence of an automatic rumour detection system the propagation of rumours has increased manifold leading to serious societal damages. In this work, we propose a novel method for build...

Bibliographic details
Published in:	arXiv.org 2022-12
Main authors:	Patel, Shaswat; Bansal, Prince; Kaur, Preeti
Format:	Article
Language:	eng
Subjects:	Criteria; Data augmentation; Datasets; Graph neural networks; Neural networks; Oversampling
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Patel, Shaswat ; Bansal, Prince ; Kaur, Preeti
description	Recently, online social media has become a primary source of new information as well as of misinformation and rumours. In the absence of an automatic rumour detection system, the propagation of rumours has increased manifold, leading to serious societal damage. In this work, we propose a novel method for building an automatic rumour detection system by focusing on oversampling to alleviate the fundamental challenge of class imbalance in the rumour detection task. Our oversampling method relies on contextualised data augmentation to generate synthetic samples for underrepresented classes in the dataset. The key idea is to select which tweets in a thread to augment, by introducing non-random selection criteria that focus the augmentation process on relevant tweets. Furthermore, we propose two graph neural networks (GNNs) to model non-linear conversations in a thread. To enhance the tweet representations in our method, we employ a custom feature selection technique based on the state-of-the-art BERTweet model. Experiments on three publicly available datasets confirm that 1) our GNN models outperform the current state-of-the-art classifiers by more than 20% (F1-score); 2) our oversampling technique increases model performance by more than 9% (F1-score); 3) focusing on relevant tweets for data augmentation via non-random selection criteria can further improve the results; and 4) our method has superior capabilities to detect rumours at a very early stage.
format	Article
publisher	Ithaca: Cornell University Library, arXiv.org
date	2022-12-20
rights	2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
oa	free_for_read
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2022-12
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2756547996
source	Free E- Journals
subjects	Criteria ; Data augmentation ; Datasets ; Graph neural networks ; Neural networks ; Oversampling
title	Rumour detection using graph neural network and oversampling in benchmark Twitter dataset
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T11%3A21%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Rumour%20detection%20using%20graph%20neural%20network%20and%20oversampling%20in%20benchmark%20Twitter%20dataset&rft.jtitle=arXiv.org&rft.au=Patel,%20Shaswat&rft.date=2022-12-20&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2756547996%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2756547996&rft_id=info:pmid/&rfr_iscdi=true