Covid based question criticality prediction with domain adaptive BERT embeddings


Bibliographic details
Published in: Engineering Applications of Artificial Intelligence, 2024-06, Vol. 132, p. 107913, Article 107913
Main authors: Jeyaraj, Shiney; T., Raghuveera
Format: Article
Language: English
Description
Summary: Covid-19 has resulted in an infodemic, with millions of people posting their queries across social media, forums and chatbots. The queries exhibit heterogeneity with respect to emotions, context, dynamics and the influence of misinformation. Given this context, it is of utmost importance to filter the critical questions from the overall information available. We propose a novel framework for predicting the criticality of a question based on four dimensions, namely Emotion, Topic, Mutability and Veracity. The key components of the framework are: (i) an emotion classifier utilizing Covid-specific BERT embeddings, with micro and macro F1-scores of 0.7191 and 0.6630 respectively on the Covid-Q dataset; (ii) a topic identifier with Covid-specific BERT embeddings, with micro and macro F1-scores of 0.6721 and 0.6880 respectively on the Covid-Q dataset, which outperforms an existing classifier built with domain-independent BERT by 11%; (iii) quantification of the mutability of answers to questions across time with a two-step approach comprising T5-based question generation followed by Sentence-BERT-based semantic search; and (iv) ascertainment of the veracity of questions with a Sentence-BERT-based semantic search. Regression analysis finds p-values below 0.05 for each of the chosen variables, validating their statistical significance for overall question criticality prediction.
ISSN:0952-1976
DOI:10.1016/j.engappai.2024.107913
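
Note on the reported metrics: the summary gives both micro and macro F1-scores for the emotion classifier and the topic identifier. The sketch below shows how the two averages differ on a toy single-label example. It is an illustration only, not the authors' evaluation code; the function name `f1_scores` and the labels `fear`, `anger` and `joy` are hypothetical and do not come from the article or the Covid-Q dataset.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical toy labels (not taken from the article).
y_true = ["fear", "fear", "fear", "fear", "anger", "joy"]
y_pred = ["fear", "fear", "fear", "anger", "anger", "fear"]
micro_f1, macro_f1 = f1_scores(y_true, y_pred)
print(f"micro F1 = {micro_f1:.4f}, macro F1 = {macro_f1:.4f}")
```

In this toy case the rare class `joy` is never predicted correctly, so the macro average falls well below the micro average. Micro F1 weights every question equally, while macro F1 weights every class equally, which is why the two figures in the summary can differ in either direction.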