RumorLLM: A Rumor Large Language Model-Based Fake-News-Detection Data-Augmentation Approach

Bibliographic details
Published in:	Applied Sciences 2024-04, Vol.14 (8), p.3532
Main authors:	Lai, Jianqiao, Yang, Xinran, Luo, Wenyue, Zhou, Linjiang, Li, Langchen, Wang, Yongqi, Shi, Xiaochuan
Format:	Article
Language:	English
Subjects:	Accuracy; Analysis; Artificial intelligence; category imbalance; Classification; Computational linguistics; data augmentation; Datasets; Deep learning; Disinformation; Efficiency; fake-news detection; False information; Gossip; Language; Language processing; large language models; Machine learning; Natural language interfaces; Neural networks; rumor generation; Semantics; Social networks; Writing
Online access:	Full text

Description
Abstract:	With the rapid development of the Internet and social media, false information, rumors, and misleading content have become pervasive, posing significant threats to public opinion and social stability, and even causing serious societal harm. This paper introduces a novel solution to address the challenges of fake news detection, presenting the “Rumor Large Language Models” (RumorLLM), a large language model fine-tuned with rumor writing styles and content. The key contributions include the development of RumorLLM and a data-augmentation method for small categories, effectively mitigating the issue of category imbalance in real-world fake-news datasets. Experimental results on the BuzzFeed and PolitiFact datasets demonstrate the superiority of the proposed model over baseline methods, particularly in F1 score and AUC-ROC. The model’s robust performance highlights its effectiveness in handling imbalanced datasets and provides a promising solution to the pressing issue of false-information proliferation.
ISSN:	2076-3417
DOI:	10.3390/app14083532