The Contribution of Selected Linguistic Markers for Unsupervised Arabic Verb Sense Disambiguation


Detailed Description

Bibliographic Details
Published in: ACM Transactions on Asian and Low-Resource Language Information Processing, 2023-08, Vol. 22 (8), p. 1-23, Article 208
Main Authors: Djaidri, Asma; Aliane, Hassina; Azzoune, Hamid
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Word sense disambiguation (WSD) is the task of automatically determining the meaning of a polysemous word in a specific context. Word sense induction is the unsupervised clustering of word usages across different contexts to distinguish senses and perform unsupervised WSD. Most studies treat function words as stop words and delete them in the pre-processing step. However, function words can encode meaningful information that helps improve the performance of WSD approaches. In this work we propose a novel approach to Arabic verb sense disambiguation based on a preposition-based classification, which is used in an automatic word sense induction step to build sense inventories for disambiguating Arabic verbs. Moreover, in the wake of the success of neural language models, recent works have obtained encouraging results using BERT pre-trained models for English-language WSD. Hence, we use contextualized word embeddings for an unsupervised Arabic WSD that is based on linguistic markers and uses sentence-BERT Transformer pre-trained models, yielding encouraging results that outperform existing unsupervised neural AWSD approaches.
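The sense-induction step the abstract describes, clustering contextual embeddings of a verb's usages so that each cluster approximates one sense, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the paper uses sentence-BERT embeddings, whereas here synthetic vectors stand in for them so the sketch stays self-contained.

```python
# Sketch of unsupervised word sense induction by clustering usage
# embeddings. In the paper, each vector would be a sentence-BERT
# embedding of one context containing the target verb; here random
# vectors from two well-separated groups stand in for two senses
# (an illustrative assumption).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Mock "contextual embeddings" for 6 usages of one polysemous verb.
contexts = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(3, 8)),  # usages of sense A
    rng.normal(loc=5.0, scale=0.1, size=(3, 8)),  # usages of sense B
])

# Induce senses by clustering the usage embeddings; each cluster
# becomes one entry in the sense inventory for this verb.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(contexts)
labels = kmeans.labels_

# Usages drawn from the same group receive the same induced sense label.
print(labels)
```

A new occurrence of the verb would then be disambiguated by embedding its context and assigning it to the nearest induced cluster (e.g. via `kmeans.predict`).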
ISSN: 2375-4699
2375-4702
DOI: 10.1145/3605777