Count Me Too: Sentiment Analysis of Roman Sindhi Script
Social media has given voice to people around the globe. However, all voices are not counted due to the scarcity of lexical computational resources. Such resources could harness the torrent of social media text data. Computational resources for rich languages such as English are available. More are...
Gespeichert in:
Veröffentlicht in: | SAGE open 2023-07, Vol.13 (3) |
---|---|
Hauptverfasser: | , , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Social media has given voice to people around the globe. However, all voices are not counted due to the scarcity of lexical computational resources. Such resources could harness the torrent of social media text data. Computational resources for rich languages such as English are available. More are being developed, meanwhile strengthening and enhancing the current ones. However, Roman Sindhi, a resource-poor writing style, is a phonetically rich language lacking computational resources, creating a working space for researchers. This work attempts to develop lexical sentiment resources that will help calculate the public opinion expressed in Roman Sindhi and bring their point of view into the limelight. This work is one of the initial efforts to develop lexical Roman Sindhi sentiment dictionary resources to help detect sentiment orientation in a text. Furthermore, it also developed two interfaces to leverage the lexical resources—a Roman Sindhi to English translator (RoSET) that translates a Roman Sindhi feature into an equivalent English word and a Roman Sindhi rule-based sentiment scorer (RBRS3) that assigns sentiment score to a Roman Sindhi script features. The results obtained from the developed system accommodated the bilingual dataset (Roman Sindhi + English) more adequately. An increase of 20.8% was recorded for positive sentence detection, and a 16% increase was obtained for negative sentences, whereas neutral sentences were marginalized to a lower number (59.31% decrease). The resultant system makes those public voices expressed in the Roman Sindhi script get counted, which otherwise are in vain. |
---|---|
ISSN: | 2158-2440 2158-2440 |
DOI: | 10.1177/21582440231197452 |