Tashaphyne0.4: a new arabic light stemmer based on rhyzome modeling approach
Published in: | Information retrieval (Boston) 2023-12, Vol.26 (1-2), p.14, Article 14 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Stemming algorithms are crucial tools for enhancing information retrieval in natural language processing. This paper presents a novel Arabic light stemming algorithm called Tashaphyne0.4. The idea behind the algorithm is to extract the most precise 'roots' and 'stems' from the words of an Arabic text. The proposed algorithm thus acts as a rooter, a stemmer, and a segmentation tool at the same time. The approach has three phases: Preparation, Stems-Extractor, and Root-Extractor. Tashaphyne0.4 showed better results than six other stemmers (Khoja, ISRI, Motaz/Light10, Tashaphyne0.3, FARASA, and Assem). The comparison was performed on four comprehensive Arabic benchmark datasets. In conclusion, the proposed stemmer outperformed the competing stemmers in extracting roots and stems. |
---|---|
ISSN: | 1386-4564 1573-7659 |
DOI: | 10.1007/s10791-023-09429-y |
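
To make the three phases named in the abstract (Preparation, Stems-Extractor, Root-Extractor) more concrete, here is a minimal toy sketch of how a light stemmer of this general kind can be organized. It is not the Tashaphyne0.4 algorithm: the affix lists, the weak-letter rule, and the function names `prepare`, `extract_stem`, `extract_root`, and `analyze` are all assumptions made for illustration only.

```python
import re

# Hypothetical, deliberately tiny affix lists (NOT taken from Tashaphyne0.4).
PREFIXES = ["وال", "بال", "ال", "و"]
SUFFIXES = ["ات", "ون", "ين", "ها", "ة"]

# Arabic short-vowel marks (U+064B..U+0652) and tatweel (U+0640).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")

# Assumption: treat alif, waw, and ya as droppable when guessing a root.
WEAK_LETTERS = set("اوي")


def prepare(word):
    """Phase 1 (toy): normalize the word by stripping diacritics and tatweel."""
    return DIACRITICS.sub("", word)


def extract_stem(word, min_len=3):
    """Phase 2 (toy): strip at most one prefix and one suffix, longest first."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def extract_root(stem, min_len=3):
    """Phase 3 (toy): drop weak letters if at least min_len letters remain."""
    candidate = "".join(ch for ch in stem if ch not in WEAK_LETTERS)
    return candidate if len(candidate) >= min_len else stem


def analyze(word):
    """Run the three toy phases and return (stem, root)."""
    stem = extract_stem(prepare(word))
    return stem, extract_root(stem)


if __name__ == "__main__":
    stem, root = analyze("الكاتبون")
    print(stem, root)
```

A real light stemmer would use much larger affix tables and validate candidate roots against a dictionary or pattern list; the crude weak-letter rule here only serves to show where a root-extraction step sits relative to stem extraction.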