FEDS-ICL: Enhancing translation ability and efficiency of large language model by optimizing demonstration selection



Bibliographic Details
Published in: Information Processing & Management, 2024-09, Vol. 61 (5), p. 103825, Article 103825
Authors: Zhu, Shaolin; Pan, Leiyu; Xiong, Deyi
Format: Article
Language: English
Online access: Full text
Description
Abstract: Large language models (LLMs), which exhibit a remarkable ability for in-context learning (ICL) with bilingual demonstrations, have been recognized as a potential solution for machine translation. However, selecting these demonstrations from vast datastores is notoriously time-consuming and inefficient, and strategies for designing effective in-context demonstrations are not well established. To address these critical gaps, we introduce a novel Fast and Effective approach to Demonstration Selection for In-Context Learning (FEDS-ICL) tailored to LLMs, designed to enhance both the efficiency and the translation accuracy of LLMs. Our approach overhauls demonstration selection with a new product quantization technique that rapidly extracts neighboring target tokens from a strategically curated subset of sentences; this departs from the conventional exhaustive search across an entire datastore and yields a substantial increase in speed. Furthermore, FEDS-ICL pioneers an innovative template design for in-context demonstrations, specifically crafted to amplify the translation capabilities of multilingual LLMs. In experiments, we compare FEDS-ICL with various existing methods across diverse language pairs on ten different LLMs. The results reveal up to a 2.1-fold increase in selection speed and an improvement in translation accuracy of up to 2.0 BLEU points over existing baselines. An ablation study shows that the proposed product quantization and multi-view demonstrations effectively enhance the efficiency and accuracy of LLMs in machine translation. An analysis of the robustness of FEDS-ICL shows a positive correlation between the quantity of contextually rich demonstrations and the translation quality of LLMs. These advancements position FEDS-ICL as a transformative methodology in the domain of machine translation and pattern analysis, marking a significant step towards more efficient and precise machine translation.

Highlights:
• We explore how to enhance the translation ability and efficiency of large language models.
• A new product quantization technique accelerates demonstration selection.
• An innovative template design for in-context learning applied to machine translation.
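The abstract describes two mechanisms: product quantization to speed up nearest-neighbor retrieval of demonstrations, and a demonstration template for the translation prompt. The sketch below is a minimal, hypothetical Python illustration of those two ideas, not the authors' implementation; the `ProductQuantizer` class, the `build_prompt` helper, the plain k-means codebook training, and the Source/Target template are all assumptions made for illustration.

```python
# Minimal, hypothetical sketch (NOT the paper's code): product quantization
# for fast approximate neighbor retrieval over a demonstration datastore,
# plus a simple bilingual prompt template. All names are illustrative.
import numpy as np

class ProductQuantizer:
    """Split D-dim embeddings into M sub-vectors and quantize each against a
    small per-subspace codebook. Search then sums entries of per-subspace
    lookup tables instead of computing full D-dimensional distances."""

    def __init__(self, n_subspaces=8, n_centroids=16, n_iter=20, seed=0):
        self.M, self.k, self.n_iter = n_subspaces, n_centroids, n_iter
        self.rng = np.random.default_rng(seed)

    def fit(self, X):
        # X: (N, D) embeddings; D must be divisible by M and N >= n_centroids.
        self.d = X.shape[1] // self.M
        self.codebooks = [self._kmeans(X[:, m * self.d:(m + 1) * self.d])
                          for m in range(self.M)]
        return self

    def _kmeans(self, sub):
        # Plain Lloyd's k-means; a production system would use a tuned library.
        centroids = sub[self.rng.choice(len(sub), self.k, replace=False)].copy()
        for _ in range(self.n_iter):
            assign = ((sub[:, None] - centroids[None]) ** 2).sum(-1).argmin(1)
            for c in range(self.k):
                members = sub[assign == c]
                if len(members):
                    centroids[c] = members.mean(0)
        return centroids

    def encode(self, X):
        # Compress every row to M small integer codes (nearest centroid index).
        codes = np.empty((len(X), self.M), dtype=np.int64)
        for m in range(self.M):
            sub = X[:, m * self.d:(m + 1) * self.d]
            codes[:, m] = ((sub[:, None] - self.codebooks[m][None]) ** 2).sum(-1).argmin(1)
        return codes

    def search(self, query, codes, topk=4):
        # Asymmetric distance: build one (M, k) table per query, then the
        # distance to every stored item is a sum of M table lookups.
        tables = np.stack([((query[m * self.d:(m + 1) * self.d]
                             - self.codebooks[m]) ** 2).sum(-1)
                           for m in range(self.M)])          # (M, k)
        dists = tables[np.arange(self.M), codes].sum(1)      # (N,) approx distances
        return np.argsort(dists)[:topk]

def build_prompt(demo_pairs, source_sentence):
    # Illustrative demonstration template; the paper's multi-view template
    # design is more elaborate than this.
    demos = "\n".join(f"Source: {s}\nTarget: {t}" for s, t in demo_pairs)
    return f"{demos}\nSource: {source_sentence}\nTarget:"

# Usage: embed and encode the datastore once, then search per query sentence.
rng = np.random.default_rng(1)
embeddings = rng.random((500, 64))           # stand-in for sentence embeddings
pq = ProductQuantizer().fit(embeddings)
codes = pq.encode(embeddings)                # one-off compression of the datastore
neighbor_ids = pq.search(embeddings[0], codes, topk=4)
```

The speed advantage comes from comparing compact integer codes against small per-subspace lookup tables instead of full embedding vectors, the standard product-quantization trade of a little retrieval accuracy for much faster search.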
ISSN: 0306-4573; 1873-5371
DOI: 10.1016/j.ipm.2024.103825