ALBETO and DistilBETO: Lightweight Spanish Language Models
Format: Article
Language: English
Abstract: In recent years there have been considerable advances in pre-trained language
models, where non-English language versions have also been made available. Due
to their increasing use, many lightweight versions of these models (with
reduced parameters) have also been released to speed up training and inference
times. However, versions of these lighter models (e.g., ALBERT, DistilBERT) for
languages other than English are still scarce. In this paper we present ALBETO
and DistilBETO, which are versions of ALBERT and DistilBERT pre-trained
exclusively on Spanish corpora. We train several versions of ALBETO ranging
from 5M to 223M parameters and one of DistilBETO with 67M parameters. We
evaluate our models on the GLUES benchmark, which includes various natural
language understanding tasks in Spanish. The results show that our lightweight
models achieve results competitive with those of BETO (Spanish-BERT) despite
having fewer parameters. More specifically, our larger ALBETO model outperforms
all other models on the MLDoc, PAWS-X, XNLI, MLQA, SQAC and XQuAD datasets.
However, BETO remains unbeaten for POS and NER. As a further contribution, all
models are publicly available to the community for future research.
DOI: 10.48550/arxiv.2204.09145
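Since the abstract states that all models are publicly released, the sketch below shows how such a checkpoint would typically be loaded with the Hugging Face Transformers library. The hub identifier dccuchile/albert-base-spanish is an assumption about where the authors published ALBETO, not something stated in this entry.

```python
# Minimal sketch: load an ALBETO checkpoint and run a masked-token prediction.
# The model name below is an assumed Hugging Face Hub identifier.
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "dccuchile/albert-base-spanish"  # assumed hub id for ALBETO base

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)

# Quick sanity check: predict a masked Spanish word.
text = f"Los modelos de lenguaje en español son {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# Find the mask position and decode the highest-scoring token.
mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
predicted_id = outputs.logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```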