Contextual information usage for the enhancement of basic emotion classification in a weakly labelled social network dataset in Spanish

Bibliographic details
Published in:	Multimedia Tools and Applications, 2023-03, Vol. 82 (7), pp. 9871-9890
Main authors:	Tessore, Juan Pablo; Esnaola, Leonardo Martín; Ramón, Hugo Dionisio; Lanzarini, Laura; Baldassarri, Sandra
Format:	Article
Language:	English
Subjects:	1222: Intelligent Multimedia Data Analytics and Computing; Algorithms; Availability; Classification; Computer Communication Networks; Computer Science; Data mining; Data Structures and Information Theory; Datasets; Emotions; Machine learning; Multimedia Information Systems; Non-English languages; Sentiment analysis; Social networks; Special Purpose and Application-Based Systems
Online access:	Full text

Description
Summary:	Basic emotion classification is one of the main tasks of Sentiment Analysis and is usually performed with machine learning techniques. One of the main issues in Sentiment Analysis is the availability of tagged resources to properly train supervised classification algorithms. This is of particular concern in languages other than English, such as Spanish, where scarcity of these resources is the norm. In addition, most basic emotion datasets available in Spanish are rather small, containing a few hundred (or a few thousand) samples. Usually, each sample contains only a short text (frequently a comment) and a tag (the basic emotion), omitting crucial contextual information that may help to improve classification results. In this paper, the impact of using contextual information is measured on a recently published Spanish basic emotion dataset, using the baseline architecture proposed in the Semantic Evaluation 2019 competition. This particular dataset has two main advantages for this paper. First, it was compiled using Distant Supervision and, as a result, contains several hundred thousand samples. Second, its authors included valuable contextual information for each comment. The results show that contextual information, such as news headlines or summaries, helps improve classification accuracy on a dataset of comments labelled with basic emotions through distant supervision.
ISSN:	1380-7501 1573-7721
DOI:	10.1007/s11042-022-13750-x