Leveraging ParsBERT for cross-domain polarity sentiment classification of Persian social media comments

Bibliographic Details
Published in:	Multimedia Tools and Applications, 2024, Vol. 83 (4), pp. 10677-10694
Main Authors:	Panahandeh Nigjeh, Mahnaz; Ghanbari, Shirin
Format:	Article
Language:	English
Subjects:	Computer Communication Networks; Computer Science; Data Structures and Information Theory; Multimedia Information Systems; Special Purpose and Application-Based Systems
Online Access:	Full text

Description
Abstract:	Sentiment analysis is the computational study of human emotions, attitudes and opinions through the extraction of meaningful information. Social media platforms, which allow consumers to share and publish content, are rich in opinionated information, yet many analytical studies are currently limited to a specific domain. This research presents an architecture for analyzing a low-resource language, Persian, and focuses on social media content consisting of informal comments across different domains. The proposed model applies a transformer-based model, ParsBERT, to classify the sentiment of social media comments. Because these comments span different domains, the model must be able to classify sentiment across domains. ParsBERT has been fine-tuned on a Persian corpus generated for the purpose of this study. The corpus was gathered from 28,710 Instagram comments in different topic domains, which were labeled as either negative or positive. The proposed model has been evaluated on test data from different time periods and topic domains, and the results have been compared with recent sentiment analysis methods in three different scenarios. The results show that when the training and test data come from different domains, an accuracy of 68% is achieved, which is higher than that of the other shallow and deep learning methods compared for determining the sentiment of social media comments in different domains.
ISSN:	1380-7501, 1573-7721
DOI:	10.1007/s11042-023-16067-5