Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages
Published in: | Findings of the Association for Computational Linguistics: ACL 2023 |
Main authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Keywords: | |
Online access: | Order full text |
Abstract: | While impressive performance has been achieved on the task of Answer Sentence Selection (AS2) for English, the same does not hold for languages that lack large labeled datasets. In this work, we propose Cross-Lingual Knowledge Distillation (CLKD) from a strong English AS2 teacher as a method to train AS2 models for low-resource languages without the need for labeled data in the target language. To evaluate our method, we introduce 1) Xtr-WikiQA, a translation-based WikiQA dataset for 9 additional languages, and 2) TyDi-AS2, a multilingual AS2 dataset with over 70K questions spanning 8 typologically diverse languages. We conduct extensive experiments on Xtr-WikiQA and TyDi-AS2 with multiple teachers, diverse monolingual and multilingual pretrained language models (PLMs) as students, and both monolingual and multilingual training. The results demonstrate that CLKD either outperforms or rivals both supervised fine-tuning with the same amount of labeled data and a combination of machine translation with the teacher model. Our method can potentially enable stronger AS2 models for low-resource languages, while TyDi-AS2 can serve as the largest multilingual AS2 dataset for further studies in the research community. |
---|---|
DOI: | 10.48550/arxiv.2305.16302 |
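
The abstract describes CLKD as transferring knowledge from a strong English AS2 teacher to a student for a target language without target-language labels. The snippet below is a minimal sketch of that idea under assumptions not taken from the record: a temperature-scaled KL-divergence soft-label loss, `xlm-roberta-base` as the multilingual student, a placeholder teacher checkpoint standing in for a model fine-tuned on English AS2 data, and an English view of each (question, candidate) pair obtained, e.g., via machine translation. The paper's actual models, loss, and data pipeline may differ.

```python
# Minimal sketch of Cross-Lingual Knowledge Distillation (CLKD) for AS2.
# Assumptions (not from the record): KL-divergence soft-label loss,
# a teacher already fine-tuned for English AS2, xlm-roberta-base student.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder teacher; in practice this would be a checkpoint fine-tuned on
# English AS2 data (e.g., on WikiQA-style question/candidate pairs).
TEACHER = "roberta-base"
STUDENT = "xlm-roberta-base"   # assumed multilingual student backbone

teacher_tok = AutoTokenizer.from_pretrained(TEACHER)
student_tok = AutoTokenizer.from_pretrained(STUDENT)
teacher = AutoModelForSequenceClassification.from_pretrained(TEACHER, num_labels=2).eval()
student = AutoModelForSequenceClassification.from_pretrained(STUDENT, num_labels=2)
optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)


def clkd_step(question_en, candidate_en, question_tgt, candidate_tgt, temperature=2.0):
    """One distillation step: the teacher scores an English (question, candidate)
    pair; the student is trained on the corresponding target-language pair to
    match the teacher's soft label distribution. No target-language gold labels
    are used."""
    with torch.no_grad():
        t_batch = teacher_tok(question_en, candidate_en, return_tensors="pt", truncation=True)
        t_logits = teacher(**t_batch).logits

    s_batch = student_tok(question_tgt, candidate_tgt, return_tensors="pt", truncation=True)
    s_logits = student(**s_batch).logits

    # Temperature-scaled KL divergence between student and teacher distributions.
    loss = temperature ** 2 * F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the English side of each pair could come from machine translation of unlabeled target-language data (or the teacher could score cross-lingually); either way the student only ever needs unlabeled (question, candidate) pairs in the target language, which is the property the abstract highlights.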