Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier
Saved in:

Main authors:
Format: Article
Language: eng
Subject terms:
Online access: Order full text
Summary: We introduce the Aya Expanse model family, a new generation of 8B and 32B
parameter multilingual language models, aiming to address the critical
challenge of developing highly performant multilingual models that match or
surpass the capabilities of monolingual models. By leveraging several years of
research at Cohere For AI and Cohere, including advancements in data arbitrage,
multilingual preference training, and model merging, Aya Expanse sets a new
state-of-the-art in multilingual performance. Our evaluations on the
Arena-Hard-Auto dataset, translated into 23 languages, demonstrate that Aya
Expanse 8B and 32B outperform leading open-weight models in their respective
parameter classes, including Gemma 2, Qwen 2.5, and Llama 3.1, achieving up to
a 76.6% win-rate. Notably, Aya Expanse 32B outperforms Llama 3.1 70B, a model
with twice as many parameters, achieving a 54.0% win-rate. In this short
technical report, we present extended evaluation results for the Aya Expanse
model family and release their open weights, together with a new multilingual
evaluation dataset, m-ArenaHard.
DOI: 10.48550/arxiv.2412.04261
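
The reported win-rates (76.6% and 54.0%) are head-to-head preferences against a baseline model over the translated Arena-Hard-Auto prompts. As a minimal sketch of how such a number can be computed from per-prompt judge verdicts (the verdict format and the tie handling below are assumptions, not the paper's exact protocol):

```python
from collections import Counter

def win_rate(verdicts: list[str]) -> float:
    """Pairwise win-rate of a candidate model against a baseline.

    `verdicts` holds one entry per prompt: "win", "loss", or "tie".
    Ties are counted as half a win, a common convention; the paper's
    actual scoring rule may differ.
    """
    counts = Counter(verdicts)
    return (counts["win"] + 0.5 * counts["tie"]) / len(verdicts)

# Hypothetical example: 7 wins, 2 ties, 1 loss over 10 prompts.
print(win_rate(["win"] * 7 + ["tie"] * 2 + ["loss"]))  # 0.8
```

The abstract also names model merging as one of the contributing techniques. The following is a generic illustration of linear weight merging (uniform averaging of checkpoint parameters), assuming PyTorch state dicts with matching keys; the abstract does not specify Aya Expanse's actual merging recipe:

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    """Linearly combine matching tensors across model checkpoints.

    Defaults to a uniform average. Assumes all checkpoints share the
    same architecture and parameter names.
    """
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float()
                          for w, sd in zip(weights, state_dicts))
    return merged
```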