Event Extraction in Basque: Typologically motivated Cross-Lingual Transfer-Learning Analysis
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Cross-lingual transfer-learning is widely used in Event Extraction for
low-resource languages and involves a Multilingual Language Model that is
trained in a source language and applied to the target language. This paper
studies whether the typological similarity between source and target languages
impacts the performance of cross-lingual transfer, an under-explored topic. We
first focus on Basque as the target language, which is an ideal target language
because it is typologically different from surrounding languages. Our
experiments on three Event Extraction tasks show that linguistic
characteristics shared between source and target languages do have an impact on
transfer quality. Further analysis of 72 language pairs reveals that for tasks
that involve token classification such as entity and event trigger
identification, common writing script and morphological features produce higher
quality cross-lingual transfer. In contrast, for tasks involving structural
prediction like argument extraction, common word order is the most relevant
feature. In addition, we show that when increasing the training size, not all
the languages scale in the same way in the cross-lingual setting. To perform
the experiments we introduce EusIE, an event extraction dataset for Basque,
which follows the Multilingual Event Extraction dataset (MEE). The dataset and
code are publicly available. |
---|---|
DOI: | 10.48550/arxiv.2404.06392 |