Investigating Large Language Models for Complex Word Identification in Multilingual and Multidomain Setups
Format: Article
Language: English
Abstract: Complex Word Identification (CWI) is an essential step in the lexical simplification task and has recently become a task in its own right. Several variations of this binary classification task have emerged, such as lexical complexity prediction (LCP) and complexity evaluation of multi-word expressions (MWEs). Large language models (LLMs) have recently become popular in the Natural Language Processing community because of their versatility and their ability to solve unseen tasks in zero- and few-shot settings. Our work investigates the use of LLMs, specifically open-source models such as Llama 2, Llama 3, and Vicuna v1.5, and closed-source models such as ChatGPT-3.5-turbo and GPT-4o, in the CWI, LCP, and MWE settings. We evaluate zero-shot, few-shot, and fine-tuning settings and show that LLMs either struggle under certain conditions or merely match existing methods. In addition, we offer some perspectives on meta-learning combined with prompt learning. We conclude that current LLMs barely outperform, if at all, existing methods, which are usually much smaller.
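To make the zero-shot CWI setting described in the abstract concrete, a minimal sketch of prompting an open-source chat model to label a target word as complex or simple might look as follows. The prompt wording, the choice of the Llama-2-7b-chat checkpoint via Hugging Face transformers, and the answer parsing are illustrative assumptions, not the paper's exact setup.

```python
# Minimal zero-shot CWI sketch (assumed prompt and parsing, not the paper's
# exact protocol). Requires: pip install transformers torch
from transformers import pipeline

# One of the open-source models the paper evaluates; checkpoint access on
# Hugging Face is gated behind the Llama 2 license.
generator = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")

def classify_word(sentence: str, target: str) -> str:
    """Binary CWI: ask the model whether `target` is complex in `sentence`."""
    prompt = (
        "Decide whether the target word is complex (hard to understand for a "
        "non-native reader) in the given sentence. Answer with exactly one "
        "word: 'complex' or 'simple'.\n"
        f"Sentence: {sentence}\n"
        f"Target word: {target}\n"
        "Answer:"
    )
    out = generator(prompt, max_new_tokens=3, do_sample=False)[0]["generated_text"]
    # Keep only the continuation after the prompt; default to 'simple'
    # when the answer is unparseable.
    answer = out[len(prompt):].strip().lower()
    return "complex" if answer.startswith("complex") else "simple"

print(classify_word(
    "The ruling was appealed on grounds of procedural impropriety.",
    "impropriety",
))
```

A few-shot variant of this sketch would simply prepend a handful of labeled sentence/target pairs to the same prompt, and an LCP variant would ask for a continuous complexity score instead of a binary label.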
DOI: 10.48550/arxiv.2411.01706