ViANLI: Adversarial Natural Language Inference for Vietnamese
Main authors: | , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | The development of Natural Language Inference (NLI) datasets and models has
been inspired by innovations in annotation design. With the rapid progress of
machine learning, existing models have quickly reached state-of-the-art
results on a variety of natural language processing tasks, including natural
language inference. By using a pre-trained model during the annotation
process, it is possible to challenge current NLI models by having humans
produce premise-hypothesis pairs that the model cannot correctly predict. To
keep natural language inference research for Vietnamese attractive and
challenging, in this paper we introduce an adversarial NLI dataset, named
ViANLI, to the NLP research community. The dataset contains more than 10K
premise-hypothesis pairs and is built through a continuously adjusted process
to get the most out of the patterns generated by the annotators. ViANLI poses
considerable difficulty for current SOTA models: the most powerful model
reaches an accuracy of only 48.4% on the test set. Additionally, the
experimental results show that models trained on our dataset achieve
significantly improved results on other Vietnamese NLI datasets. |
---|---|
DOI: | 10.48550/arxiv.2406.17716 |