DiTTO: A Feature Representation Imitation Approach for Improving Cross-Lingual Transfer
Saved in:
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Zero-shot cross-lingual transfer is promising; however, it has been shown to be sub-optimal, with inferior transfer performance for low-resource languages. In this work, we envision languages as domains for improving zero-shot transfer by jointly reducing the feature incongruity between the source and the target language and increasing the generalization capabilities of pre-trained multilingual transformers. We show that our approach, DiTTO, significantly outperforms the standard zero-shot fine-tuning method on multiple datasets across all languages, using only unlabeled instances in the target language. Empirical results show that jointly reducing feature incongruity for multiple target languages is vital for successful cross-lingual transfer. Moreover, our model enables better cross-lingual transfer than standard fine-tuning methods, even in the few-shot setting.
DOI: 10.48550/arxiv.2303.02357
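
The abstract outlines the core recipe: fine-tune a multilingual transformer so that target-language feature representations imitate source-language ones, using only unlabeled target-language instances. Below is a minimal sketch of that general idea: a supervised task loss on labeled source batches plus an auxiliary feature-alignment term. The specific choices here, an xlm-roberta-base backbone, mean-pooled features, MSE between batch-mean features as the alignment objective, a 3-way classifier head, and the alignment_weight value, are illustrative assumptions, not DiTTO's actual formulation (see the paper at the DOI above for that).

```python
# Minimal sketch of joint task + feature-alignment fine-tuning.
# Assumptions (not from the paper): XLM-R backbone, mean-pooled features,
# MSE between batch-mean features as the alignment objective.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")
classifier = torch.nn.Linear(encoder.config.hidden_size, 3)  # e.g. 3-way NLI head

def mean_pool(hidden, mask):
    # Average token embeddings over non-padding positions.
    mask = mask.unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

def joint_loss(src_texts, src_labels, tgt_texts, alignment_weight=0.1):
    src = tokenizer(src_texts, padding=True, truncation=True, return_tensors="pt")
    tgt = tokenizer(tgt_texts, padding=True, truncation=True, return_tensors="pt")
    src_feats = mean_pool(encoder(**src).last_hidden_state, src["attention_mask"])
    tgt_feats = mean_pool(encoder(**tgt).last_hidden_state, tgt["attention_mask"])
    # Supervised task loss: labels exist only for the source language.
    task = F.cross_entropy(classifier(src_feats), src_labels)
    # Alignment loss: pull unlabeled target-language features toward the
    # source-language feature distribution (here, matching batch means).
    align = F.mse_loss(tgt_feats.mean(dim=0), src_feats.mean(dim=0))
    return task + alignment_weight * align

# Hypothetical usage: English (labeled) + French (unlabeled) batch.
loss = joint_loss(["A man is eating."], torch.tensor([0]), ["Un homme mange."])
loss.backward()
```

The alignment term could equally be any distribution-matching objective (MMD, adversarial training, and so on); the paper's particular joint objective across multiple target languages is what it credits for the reported gains.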