Chinese toponym recognition with variant neural structures from social media messages based on BERT methods


Bibliographic details
Published in: Journal of Geographical Systems, 2022-04, Vol. 24 (2), p. 143-169
Authors: Ma, Kai; Tan, YongJian; Xie, Zhong; Qiu, Qinjun; Chen, Siqiong
Format: Article
Language: English
Online access: Full text
Description
Abstract: Many natural language tasks related to geographic information retrieval (GIR) require toponym recognition, and identifying Chinese toponyms in social media messages that share real-time information is a critical problem for many practical applications, such as natural disaster response and geolocating. In this article, we focus on toponym recognition in Chinese social media messages. While existing off-the-shelf Chinese named entity recognition (NER) tools can be applied to identify toponyms, these approaches cannot handle the variety of language irregularities found in social media messages, including location name abbreviations, informal sentence structures, and combination toponyms. We present a deep neural network named BERT-BiLSTM-CRF, which extends a basic bidirectional recurrent neural network model (BiLSTM) with pretrained Bidirectional Encoder Representations from Transformers (BERT) to handle the toponym recognition task in Chinese text. On three datasets drawn from lists of alternative location names, the experimental results show that the proposed model significantly outperforms previous Chinese NER models and a set of state-of-the-art deep learning models.
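To make the architecture described in the abstract concrete: in a BERT-BiLSTM-CRF tagger, the BERT and BiLSTM layers produce a per-token score for each tag, and the CRF layer then selects the globally best tag sequence, typically via Viterbi decoding. The sketch below is an illustrative, self-contained toy implementation of that decoding step only (not the paper's code); the tag set, score values, and function name are invented for the example.

```python
# Toy sketch of the CRF decoding step in a BERT-BiLSTM-CRF tagger.
# Upstream layers (BERT + BiLSTM) would emit per-token scores for each
# tag (here B-LOC, I-LOC, O); the CRF picks the best tag sequence by
# combining emission scores with learned tag-transition scores.
# All scores below are made-up toy values, not trained weights.

def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence for one sentence.

    emissions: list of {tag: score} dicts, one per token.
    transitions: {(prev_tag, tag): score} for tag-to-tag moves.
    """
    # Best path score ending in each tag at the first token.
    score = {t: emissions[0][t] for t in tags}
    back = []  # backpointers: one {tag: best_prev_tag} dict per step
    for emit in emissions[1:]:
        new_score, ptr = {}, {}
        for t in tags:
            # Choose the best previous tag leading into tag t.
            prev = max(tags, key=lambda p: score[p] + transitions[(p, t)])
            new_score[t] = score[prev] + transitions[(prev, t)] + emit[t]
            ptr[t] = prev
        score, back = new_score, back + [ptr]
    # Trace back from the best final tag.
    last = max(tags, key=lambda t: score[t])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))


tags = ["B-LOC", "I-LOC", "O"]
# Transitions discourage illegal moves such as O -> I-LOC.
trans = {(p, t): 0.0 for p in tags for t in tags}
trans[("O", "I-LOC")] = -10.0
trans[("B-LOC", "I-LOC")] = 1.0
# Toy emissions for a 3-token sentence, e.g. a 2-token place name + other.
emissions = [
    {"B-LOC": 2.0, "I-LOC": 0.0, "O": 1.0},
    {"B-LOC": 0.0, "I-LOC": 2.0, "O": 1.0},
    {"B-LOC": 0.0, "I-LOC": 0.0, "O": 2.0},
]
print(viterbi_decode(emissions, trans, tags))  # ['B-LOC', 'I-LOC', 'O']
```

The transition scores are what let the CRF enforce tag-sequence constraints (e.g. an I-LOC must follow a B-LOC or I-LOC), which token-independent classification cannot.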
ISSN:1435-5930
1435-5949
DOI:10.1007/s10109-022-00375-9