ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models
Format: Article
Language: English
Online access: Order full text
Abstract: Performance prediction is a method to estimate the performance of Language Models (LMs) on various Natural Language Processing (NLP) tasks, mitigating computational costs associated with model capacity and data for fine-tuning. Our paper presents ProxyLM, a scalable task- and language-agnostic framework designed to predict the performance of LMs using proxy models. These proxy models act as surrogates, approximating the performance of the LM of interest. By leveraging these proxy models, ProxyLM significantly reduces computational overhead in task evaluations, achieving up to a 37.08x speedup over traditional methods, even with our smallest proxy models. Our results across multiple multilingual NLP tasks and various robustness tests demonstrate that ProxyLM not only adapts well to previously unseen languages in pre-trained LMs, but also generalizes effectively across different datasets, outperforming the state-of-the-art by at least 1.78x in terms of root-mean-square error (RMSE).
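To make the proxy-model idea concrete, the sketch below illustrates one plausible setup (not the authors' actual ProxyLM implementation): fit a standard regressor that maps proxy-model scores and a simple dataset feature to the score of a larger target LM, then report RMSE on held-out examples. All feature names and the synthetic data are illustrative assumptions.

```python
# Minimal sketch of proxy-based performance prediction. This is an
# assumed setup for illustration, not the paper's released code.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical records: one row per (language, dataset) experiment.
# Features: score of a small proxy LM, score of a medium proxy LM,
# and a crude dataset-size feature (log number of training sentences).
n = 200
proxy_small = rng.uniform(5, 35, n)              # e.g. chrF/BLEU of a small proxy
proxy_medium = proxy_small + rng.normal(4, 2, n) # a stronger proxy tracks the small one
log_train_size = rng.uniform(8, 14, n)
X = np.column_stack([proxy_small, proxy_medium, log_train_size])

# Synthetic "true" target-LM scores, correlated with the proxies so the
# regressor has signal to learn; real usage would use measured scores.
y = (0.6 * proxy_medium + 0.3 * proxy_small + 0.8 * log_train_size
     + rng.normal(0, 1.5, n))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Any standard regressor can serve as the performance predictor;
# a random forest is just one reasonable choice here.
model = RandomForestRegressor(n_estimators=300, random_state=0)
model.fit(X_tr, y_tr)

# RMSE is the evaluation metric the abstract reports.
rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
print(f"RMSE of predicted target-LM scores: {rmse:.2f}")
```

The speedup comes from the fact that evaluating the small proxy models and a lightweight regressor is far cheaper than fine-tuning and evaluating the large target LM for every new language or dataset.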
DOI: 10.48550/arxiv.2406.09334