Probabilistic vs deep learning based approaches for narrow domain NER in Spanish

Bibliographic details
Published in:	Journal of Intelligent & Fuzzy Systems 2020-01, Vol.39 (2), p.2015-2025
Main authors:	Ramos-Flores, Orlando; Pinto, David; Montes-y-Gómez, Manuel; Vázquez, Andrés
Format:	Article
Language:	English
Subjects:	Conditional random fields; Datasets; Deep learning; Domains; Machine learning; Probabilistic models; Probability theory; Recurrent neural networks; Spanish language; Training

Description
Abstract:	This work presents an experimental study of Named Entity Recognition (NER) in a narrow domain for the Spanish language. The study considers two approaches commonly used for this kind of problem: a Conditional Random Fields (CRF) model and a Recurrent Neural Network (RNN). For the latter, we employed a bidirectional Long Short-Term Memory (BiLSTM) network with ELMo’s pre-trained word embeddings for Spanish. The probabilistic model and the deep learning model were compared on two collections: the Spanish dataset from CoNLL-2002, with four classes under the IOB tagging scheme, and a Mexican Spanish news dataset, with seventeen classes under the IOBES scheme. The paper analyzes the scalability, robustness, and common errors of both models. In general, this analysis indicates that the BiLSTM-ELMo model is more suitable than the CRF model when there is “enough” training data, and that it is more scalable, since its performance was not significantly affected in the incremental experiments (adding one class at a time). On the other hand, the results indicate that the CRF model is better suited to scenarios with small training datasets and many classes.
ISSN:	1064-1246; 1875-8967
DOI:	10.3233/JIFS-179868
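
The abstract refers to two tagging schemes, IOB for the CoNLL-2002 data and IOBES for the Mexican Spanish news data. As a rough illustration of how they differ, the sketch below converts one sentence of tags from IOB to IOBES. It is not taken from the paper: the function name `iob_to_iobes` is hypothetical, the example sentence is invented, and the input is assumed to be in the IOB2 variant, where every entity starts with a `B-` tag.

```python
def iob_to_iobes(tags):
    """Convert one sentence of IOB2 tags (B-X, I-X, O) to IOBES tags.

    Assumption: the input is well-formed IOB2, so every entity
    starts with a B- tag and continues with I- tags of the same label.
    """
    converted = []
    for i, tag in enumerate(tags):
        if tag == "O":
            converted.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == "I-" + label
        if prefix == "B":
            # A B- tag with no continuation is a single-token entity (S-).
            converted.append(("B-" if continues else "S-") + label)
        else:
            # An I- tag with no continuation closes the entity (E-).
            converted.append(("I-" if continues else "E-") + label)
    return converted


# Invented example sentence: "Juan Pérez vive en Puebla"
iob_tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
iobes_tags = iob_to_iobes(iob_tags)
print(iobes_tags)
```

IOBES marks entity ends (`E-`) and single-token entities (`S-`) explicitly, so each entity class contributes four tags instead of two.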