Personality Identification from Social Media Using Ensemble BERT and RoBERTa

Bibliographic details
Published in: Informatica (Ljubljana), 2023-12, Vol. 47 (4), p. 537-544
Main authors: Tsani, Eggi Farkhan; Suhartono, Derwin
Format: Article
Language: eng
Online access: Full text
Description
Abstract: Social media has grown quickly because many people use it to express their feelings, share information, and interact with others. This growth has drawn many researchers to use social media data for research on personality identification. The identification result can be used as a parameter for screening candidates' attitudes in a company's recruitment process. Several approaches have been used in personality research; one of the most popular is the Big Five personality model. This research introduces an ensemble of BERT and RoBERTa for personality prediction on Twitter and YouTube datasets. A data augmentation method is also introduced to handle class imbalance in each dataset. The pre-trained BERT and RoBERTa models are used for both feature extraction and modeling. In predicting each Big Five trait, the voting ensemble of BERT and RoBERTa achieved an average F1 score of 0.730 on the Twitter dataset and 0.741 on the YouTube dataset. Using the proposed model, we conclude that data augmentation increases average performance compared with the same model without data augmentation.
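The voting step and the averaged F1 score mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the vote is hard or soft, so soft voting (averaging the two models' per-trait probabilities and thresholding at 0.5) is an assumption here, as are the function names, the trait order, and the toy probabilities, which are hypothetical and not taken from the paper.

```python
from typing import List, Sequence

# Big Five traits, one binary label per trait (the order is an assumption).
TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]


def soft_vote(probs_a: Sequence[Sequence[float]],
              probs_b: Sequence[Sequence[float]],
              threshold: float = 0.5) -> List[List[int]]:
    """Average two models' per-trait probabilities, then threshold to 0/1."""
    return [
        [1 if (pa + pb) / 2 >= threshold else 0 for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(probs_a, probs_b)
    ]


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for one trait: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def average_f1(y_true: Sequence[Sequence[int]],
               y_pred: Sequence[Sequence[int]]) -> float:
    """Mean of the per-trait F1 scores over all trait columns."""
    n_traits = len(y_true[0])
    scores = [
        f1_binary([row[j] for row in y_true], [row[j] for row in y_pred])
        for j in range(n_traits)
    ]
    return sum(scores) / n_traits


# Hypothetical per-trait probabilities for two texts; in practice these
# would come from fine-tuned BERT and RoBERTa classifiers.
bert_probs = [[0.9, 0.2, 0.6, 0.4, 0.1], [0.3, 0.8, 0.4, 0.7, 0.6]]
roberta_probs = [[0.7, 0.4, 0.3, 0.8, 0.2], [0.1, 0.6, 0.5, 0.9, 0.9]]

ensemble_labels = soft_vote(bert_probs, roberta_probs)
```

With only two voters, a hard majority vote can tie whenever the models disagree, which is one reason a probability-averaging variant is shown; the paper itself may resolve this differently.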
ISSN: 0350-5596, 1854-3871
DOI: 10.31449/inf.v47i4.4771