Bootstrapping Multilingual Semantic Parsers using Large Language Models
Format: Article
Language: English
Online access: Order full text
Abstract: Despite the cross-lingual generalization demonstrated by pre-trained multilingual models, the translate-train paradigm of transferring English datasets across multiple languages remains a key mechanism for training task-specific multilingual models. However, for many low-resource languages, building a reliable translation service requires significant amounts of costly human-annotated translation pairs. Further, translation services may remain brittle due to domain mismatch between task-specific input text and the general-purpose text used for training translation models. For multilingual semantic parsing, we demonstrate the effectiveness and flexibility offered by large language models (LLMs) for translating English datasets into several languages via few-shot prompting. Through extensive comparisons on two public datasets, MTOP and MASSIVE, spanning 50 languages and several domains, we show that our method of translating data using LLMs outperforms a strong translate-train baseline on 41 out of 50 languages. We study the key design choices that enable more effective multilingual data translation via prompted LLMs.
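The abstract does not specify how the few-shot translation prompts are constructed. As a rough illustration only, the Python sketch below shows one plausible way to assemble such a prompt from a handful of human-translated exemplar pairs; the function name `build_translation_prompt`, the German exemplars, and the prompt wording are all assumptions made for illustration, not the paper's actual setup.

```python
# Minimal sketch of few-shot prompt construction for LLM-based data
# translation, in the spirit of the approach described above. Exemplar
# pairs, wording, and names are illustrative assumptions, not the
# paper's actual prompt format.

def build_translation_prompt(exemplars, source_utterance, target_language):
    """Assemble a few-shot prompt asking an LLM to translate a single
    English utterance into `target_language`, conditioned on a small
    set of human-translated (English, target) exemplar pairs."""
    lines = [f"Translate the following sentences from English to {target_language}."]
    for english, translated in exemplars:
        lines.append(f"English: {english}")
        lines.append(f"{target_language}: {translated}")
    # Leave the final translation slot empty for the LLM to complete.
    lines.append(f"English: {source_utterance}")
    lines.append(f"{target_language}:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical exemplars; a real pipeline would draw these from the
    # small pool of annotated translation pairs per target language.
    exemplars = [
        ("set an alarm for 7 am", "stelle einen Wecker für 7 Uhr"),
        ("what is the weather today", "wie ist das Wetter heute"),
    ]
    prompt = build_translation_prompt(exemplars, "remind me to call mom", "German")
    print(prompt)  # This string would be sent to the LLM's completion API.
```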
DOI: 10.48550/arxiv.2210.07313