An Upgraded Approach for Identifying Partially Reduplicated Forms in Bengali Text
Published in: | SN Computer Science 2024-09, Vol.5 (7), p.892, Article 892 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | This paper presents a concise methodology for detecting partially reduplicated Multi-Word Expressions (MWEs) in Bengali text. Identification is carried out in two distinct phases. In the first phase, a Levenshtein-distance-based algorithm identifies partially reduplicated forms: it assesses the similarity between words and flags the instances in which a partially reduplicated structure exists. In the second phase, the output of the first phase is refined with a novel technique called Word Expansion, which significantly improves the performance of the reduplication identification process and leads to more accurate results. The reported evaluation metrics are Precision (90.00%), Recall (85.71%), and F1-Score (87.80%). According to the authors, these scores show that the system can successfully identify and categorize partially reduplicated MWEs in Bengali text, and that the approach surpasses the current state-of-the-art methods for identifying reduplicated expressions in Bengali text. |
---|---|
ISSN: | 2662-995X, 2661-8907 |
DOI: | 10.1007/s42979-024-03069-9 |
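
The first phase described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the adjacent-pair scan, and the `max_distance` threshold are assumptions made for illustration, and the Word Expansion phase is omitted because the record does not describe how it works. Distances are computed over Unicode code points, which is also an assumption. The `f1_score` helper only checks that the reported F1-Score follows from the reported Precision and Recall.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def partial_reduplication_candidates(tokens, max_distance=1):
    """Return adjacent token pairs that differ by 1..max_distance edits.

    max_distance is a hypothetical threshold, not a value taken from the paper.
    Identical neighbours (distance 0) are full reduplication and are skipped.
    """
    pairs = []
    for left, right in zip(tokens, tokens[1:]):
        if 0 < levenshtein(left, right) <= max_distance:
            pairs.append((left, right))
    return pairs


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # "জল টল" (jol tol) differs only in its first character.
    print(partial_reduplication_candidates(["জল", "টল", "খাও"]))
    # Reported Precision 90.00% and Recall 85.71% give the reported F1-Score.
    print(round(f1_score(0.90, 0.8571) * 100, 2))  # 87.8
```

A scan over adjacent tokens is the simplest possible candidate generator. A real system would need additional filtering, since many adjacent word pairs that are one edit apart are not reduplications.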