Chinese toponym recognition with variant neural structures from social media messages based on BERT methods
Published in: | Journal of geographical systems 2022-04, Vol.24 (2), p.143-169 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Abstract: | Many natural language tasks related to geographic information retrieval (GIR) require toponym recognition, and identifying Chinese toponyms in social media messages to share real-time information is a critical problem for many practical applications, such as natural disaster response and geolocating. In this article, we focused on toponym recognition in Chinese social media messages. While existing off-the-shelf Chinese named entity recognition (NER) tools could be applied to identify toponyms, these approaches cannot address the variety of language irregularities found in social media messages, including location name abbreviations, informal sentence structures and combination toponyms. We present a deep neural network named BERT-BiLSTM-CRF, which extends a basic bidirectional recurrent neural network model (BiLSTM) with pretrained bidirectional encoder representations from transformers (BERT) to handle the toponym recognition task in Chinese text. On three datasets taken from lists of alternative location names, the experimental results showed that the proposed model can significantly outperform previous Chinese NER models/algorithms and a set of state-of-the-art deep learning models. |
---|---|
ISSN: | 1435-5930 1435-5949 |
DOI: | 10.1007/s10109-022-00375-9 |
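
The abstract names a CRF output layer on top of BERT and BiLSTM. As a rough illustration of what such a layer does at prediction time, the sketch below runs Viterbi decoding over per-character tag scores and then joins the tagged characters into toponym strings. It is not the authors' implementation: the BIO tag set, the function names `viterbi_decode` and `extract_spans`, and the score format are all assumptions made for this example. In a full model the emission scores would come from the BERT and BiLSTM layers and the transition scores would be learned.

```python
from typing import Dict, List, Tuple

# Assumed BIO tag set for locations; the article's actual tag scheme is not given here.
TAGS = ["O", "B-LOC", "I-LOC"]


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the highest-scoring tag sequence given emission and transition scores."""
    if not emissions:
        return []
    # best[t][tag] is the best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in TAGS}]
    back: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        current, pointers = {}, {}
        for tag in TAGS:
            # Pick the previous tag that maximises path score plus transition score.
            prev = max(TAGS, key=lambda p: best[-1][p] + transitions.get((p, tag), 0.0))
            current[tag] = best[-1][prev] + transitions.get((prev, tag), 0.0) + scores[tag]
            pointers[tag] = prev
        best.append(current)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(TAGS, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def extract_spans(chars: List[str], tags: List[str]) -> List[str]:
    """Join characters tagged B-LOC/I-LOC into toponym strings."""
    spans, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "B-LOC":
            if current:
                spans.append(current)
            current = ch
        elif tag == "I-LOC" and current:
            current += ch
        else:
            if current:
                spans.append(current)
            current = ""
    if current:
        spans.append(current)
    return spans
```

A large negative transition score for the pair `("O", "I-LOC")` is one simple way to discourage tag sequences in which a location continues without having started; the scores passed in are made-up inputs, not values from the article.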