Find right countenance for your input—Improving automatic emoticon recommendation system with distributed representations
Published in: Information Processing & Management, 2021-01, Vol. 58 (1), p. 102414, Article 102414
Main authors: , ,
Format: Article
Language: English
Online access: Full text
Summary:
• Adapted pre-trained models (i.e., BERT, ELMo, and Word2vec) learned from Japanese data to our emoticon recommendation system.
• Empirically compared our proposed systems with baseline methods that learn surface patterns of texts and emoticons.
• Compared pre-trained models learned from different text domains to observe differences in recommendation results.
Emoticons are popularly used to express users' feelings in social media, blogs, and instant messaging. However, the emoticon dictionaries that users select from contain a large number of emoticons, so it is difficult for users to find the emoticon that matches the content of their message. In this paper, we propose a method that supports users' emoticon selection by reordering the 167 unique emoticons in the emoticon dictionary, applying pre-trained models learned from large-scale Japanese data. We evaluated whether adapting a pre-trained model to our emoticon recommendation system achieves better results than merely learning surface patterns of texts and emoticons. We collected pairs of sentences and emoticons in Japanese from the Internet, used pre-trained models (i.e., Word2vec, ELMo, and BERT) learned from large Japanese textual data, and applied deep learning techniques such as BiLSTM and fine-tuning. We confirmed that fine-tuning BERT on our data achieved the best recommendation accuracy of 52.98%, recommending the correct emoticon within the top 25 (top 15%) of the emoticons. Moreover, we confirmed our intuition that widely used Wikipedia-based pre-trained models are not the best choice for facemark recommendation.
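As a concrete illustration of the setup the abstract describes, the sketch below frames emoticon recommendation as a 167-way classification task with a fine-tuned BERT and scores top-25 accuracy (the paper's "correct emoticon within the top 25" criterion). This is a minimal sketch, not the authors' implementation: the checkpoint name `cl-tohoku/bert-base-japanese` and the (sentence, emoticon-id) dataset are assumptions standing in for the paper's Japanese model and web-collected data.

```python
# Minimal sketch: fine-tune a Japanese BERT to rank 167 emoticons for an
# input sentence, then evaluate top-k (k=25) recommendation accuracy.
# Checkpoint and data are illustrative, not the paper's exact setup.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NUM_EMOTICONS = 167  # size of the emoticon dictionary in the paper
TOP_K = 25           # "top 25 (top 15%)" evaluation cutoff

CHECKPOINT = "cl-tohoku/bert-base-japanese"  # assumed Japanese BERT
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(
    CHECKPOINT, num_labels=NUM_EMOTICONS
)

class EmoticonDataset(Dataset):
    """Hypothetical (sentence, emoticon_id) pairs collected from the web."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True,
                             return_tensors="pt")
        self.labels = torch.tensor(labels)
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

def top_k_accuracy(logits, labels, k=TOP_K):
    # A recommendation counts as correct if the gold emoticon
    # appears anywhere in the top-k of the ranked list.
    topk = logits.topk(k, dim=-1).indices
    return (topk == labels.unsqueeze(-1)).any(dim=-1).float().mean().item()

def fine_tune(train_texts, train_labels, epochs=3, lr=2e-5):
    loader = DataLoader(EmoticonDataset(train_texts, train_labels),
                        batch_size=16, shuffle=True)
    optim = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            optim.zero_grad()
            out = model(**batch)  # cross-entropy over 167 emoticon classes
            out.loss.backward()
            optim.step()
```

At inference time, the logits for a sentence induce a full reordering of the 167 emoticons, which is the "reordering the emoticon dictionary" view taken in the paper; the BiLSTM-over-ELMo/Word2vec variants would swap the encoder while keeping the same ranking and top-k evaluation.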
ISSN: 0306-4573, 1873-5371
DOI: 10.1016/j.ipm.2020.102414