SARCASM DETECTION IN PERSIAN

Sarcasm is a form of communication where the individual states the opposite of what is implied. Therefore, detecting a sarcastic tone is somewhat complicated due to its ambiguous nature. On the other hand, identification of sarcasm is vital to various natural language processing tasks such as sentim...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of ICT 2020-11, Vol.20 (1), p.1-20
Hauptverfasser: Zahra Bokaee Nezhad, Mohammad Ali Deihimi
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Sarcasm is a form of communication where the individual states the opposite of what is implied. Therefore, detecting a sarcastic tone is somewhat complicated due to its ambiguous nature. On the other hand, identification of sarcasm is vital to various natural language processing tasks such as sentiment analysis and text summarisation. However, research on sarcasm detection in Persian is very limited. This paper investigated the sarcasm detection technique on Persian tweets by combining deep learning-based and machine learning-based approaches. Four sets of features that cover different types of sarcasm were proposed, namely deep polarity, sentiment, part of speech, and punctuation features. These features were utilised to classify the tweets as sarcastic and nonsarcastic. In this study, the deep polarity feature was proposed by conducting a sentiment analysis using deep neural network architecture. In addition, to extract the sentiment feature, a Persian sentiment dictionary was developed, which consisted of four sentiment categories. The study also used a new Persian proverb dictionary in the preparation step to enhance the accuracy of the proposed model. The performance of the model is analysed using several standard machine learning algorithms. The results of the experiment showed that the method outperformed the baseline method and reached an accuracy of 80.82%. The study also examined the importance of each proposed feature set and evaluated its added value to the classification.
ISSN:1675-414X