Count Me Too: Sentiment Analysis of Roman Sindhi Script

Bibliographic details
Published in: SAGE Open 2023-07, Vol. 13 (3)
Main authors: Alvi, Muhammd Bux; Mahoto, Naeem Ahmed; Reshan, Mana Saleh Al; Unar, Mukhtiar; Elmagzoub, M. A.; Shaikh, Asadullah
Format: Article
Language: English
Online access: Full text
Description
Summary: Social media has given a voice to people around the globe. However, not all voices are counted, owing to the scarcity of lexical computational resources that could harness the torrent of social media text data. Computational resources for resource-rich languages such as English are available, and more are being developed while existing ones are strengthened and enhanced. Roman Sindhi, by contrast, is a resource-poor writing style for a phonetically rich language; it lacks computational resources, which leaves room for research. This work is one of the initial efforts to develop lexical Roman Sindhi sentiment dictionary resources that help detect the sentiment orientation of a text, so that public opinion expressed in Roman Sindhi can be measured and brought into the limelight. It also develops two interfaces to leverage the lexical resources: a Roman Sindhi to English translator (RoSET), which translates a Roman Sindhi feature into an equivalent English word, and a Roman Sindhi rule-based sentiment scorer (RBRS3), which assigns a sentiment score to Roman Sindhi script features. The developed system accommodated the bilingual dataset (Roman Sindhi + English) more adequately: positive sentence detection increased by 20.8% and negative sentence detection by 16%, whereas the number of sentences classified as neutral fell by 59.31%. The resulting system lets public voices expressed in the Roman Sindhi script be counted, which would otherwise go unheard.
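Illustration: the summary describes a dictionary lookup that maps a Roman Sindhi feature to an English word, followed by rule-based scoring. The Python sketch below shows one way such a pipeline could be arranged. It is not the authors' RoSET or RBRS3 implementation; the lexicon entries, the single negation rule, the score values, and all function names are hypothetical placeholders chosen for this example.

```python
# Minimal sketch of a lexicon-based translate-then-score pipeline.
# All entries below are illustrative assumptions, not the paper's resources.
TRANSLATION = {"sutho": "good", "kharab": "bad"}  # Roman Sindhi -> English (toy)
SENTIMENT = {"good": 1.0, "bad": -1.0}            # English word -> polarity (toy)
NEGATORS = {"not"}                                # words that flip the next polarity


def translate(token):
    """Return the English equivalent of a token, or the token itself if unknown."""
    key = token.lower()
    return TRANSLATION.get(key, key)


def score_sentence(sentence):
    """Sum word polarities; a negator flips the next sentiment-bearing word."""
    score = 0.0
    negate = False
    for raw in sentence.split():
        word = translate(raw.strip(".,!?"))
        if word in NEGATORS:
            negate = True
            continue
        value = SENTIMENT.get(word, 0.0)
        if value != 0.0:
            score += -value if negate else value
            negate = False
    return score


def label(score):
    """Map a numeric score to a coarse sentiment class."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under these assumptions, a mixed Roman Sindhi + English sentence such as "service kharab" is scored through the translated word "bad" and labeled negative, whereas an English-only lexicon would find no match and label it neutral. That is the kind of shift from neutral to positive or negative that the summary reports.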
ISSN:2158-2440
DOI:10.1177/21582440231197452