An Upgraded Approach for Identifying Partially Reduplicated Forms in Bengali Text


Bibliographic details
Published in: SN Computer Science 2024-09, Vol. 5 (7), p. 892, Article 892
Main authors: Barman, Abhijit; Saha, Diganta; Pal, Alok Ranjan
Format: Article
Language: English
Description
Summary: This paper presents a concise methodology for detecting partially reduplicated Multi-Word Expressions (MWEs) in Bengali text. Identification is carried out in two distinct phases. In the first phase, a Levenshtein-distance-based algorithm identifies partially reduplicated forms: it assesses the similarity between words and determines whether a partially reduplicated structure exists, flagging the relevant instances. In the second phase, the output of the first phase is refined by a novel technique known as Word Expansion, which significantly improves the accuracy of the identification process. The reported evaluation metrics are Precision (90.00%), Recall (85.71%), and F1-Score (87.80%). These scores indicate that the system can successfully identify and categorize partially reduplicated MWEs in Bengali text. The authors further report that the approach surpasses the current state-of-the-art methods for identifying reduplicated expressions in Bengali text.
ISSN: 2661-8907, 2662-995X
DOI: 10.1007/s42979-024-03069-9
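
The first phase described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the adjacent-token comparison, and the `max_ratio` threshold are all assumptions made for illustration, and the Word Expansion phase is not modelled because the record does not describe it. Edit distance is computed over Unicode code points, which is a simplification for Bengali script. The last function only checks that the reported F1-Score is consistent with the reported Precision and Recall.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance over Unicode code points."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def find_partial_reduplications(tokens, max_ratio=0.5):
    """Return adjacent token pairs that are similar but not identical.

    max_ratio is a hypothetical threshold chosen for this sketch,
    not a value taken from the paper.
    """
    pairs = []
    for first, second in zip(tokens, tokens[1:]):
        longest = max(len(first), len(second))
        if longest < 2:
            continue  # single-character tokens carry too little evidence
        distance = levenshtein(first, second)
        # distance == 0 would be full reduplication, which is out of scope here
        if distance > 0 and distance / longest <= max_ratio:
            pairs.append((first, second))
    return pairs


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # "boi toi" is an echo formation: the second word changes only its first letter
    tokens = ["সে", "বই", "টই", "কেনে"]
    print(find_partial_reduplications(tokens))
    print(round(f1_score(0.90, 0.8571) * 100, 2))
```

A length-normalized threshold is used here so that short echo words, which differ in only one character, are still accepted while unrelated neighbours are rejected. A production system would likely need Unicode normalization and grapheme-aware comparison before computing distances.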