Chinese toponym recognition with variant neural structures from social media messages based on BERT methods

Many natural language tasks related to geographic information retrieval (GIR) require toponym recognition, and identifying Chinese toponyms from social media messages to share real-time information is a critical problem for many practical applications, such as natural disaster response and geolocati...

Detailed description

Bibliographic details
Published in: Journal of geographical systems 2022-04, Vol.24 (2), p.143-169
Main authors: Ma, Kai; Tan, YongJian; Xie, Zhong; Qiu, Qinjun; Chen, Siqiong
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 169
container_issue 2
container_start_page 143
container_title Journal of geographical systems
container_volume 24
creator Ma, Kai
Tan, YongJian
Xie, Zhong
Qiu, Qinjun
Chen, Siqiong
description Many natural language tasks related to geographic information retrieval (GIR) require toponym recognition, and identifying Chinese toponyms from social media messages to share real-time information is a critical problem for many practical applications, such as natural disaster response and geolocating. In this article, we focused on toponym recognition from social media messages in Chinese. While existing off-the-shelf Chinese named entity recognition (NER) tools could be applied to identify toponyms, these approaches cannot address a variety of language irregularities taken from social media messages, including location name abbreviations, informal sentence structures and combination toponyms. We present a deep neural network named BERT-BiLSTM-CRF, which extends a basic bidirectional recurrent neural network model (BiLSTM) with the pretraining bidirectional encoder representation from transformers (BERT) representation to handle the toponym recognition task in Chinese text. Using three datasets taken from lists of alternative location names, the experimental results showed that the proposed model can significantly outperform previous Chinese NER models/algorithms and a set of state-of-the-art deep learning models.
doi_str_mv 10.1007/s10109-022-00375-9
format Article
fullrecord Record TN_cdi_proquest_journals_2659825630 (source format: XML; record type: article; source ID: gale_proqu; ProQuest ID: 2659825630; Gale ID: A702702937). Abbreviated journal title: J Geogr Syst. Publisher: Springer Berlin Heidelberg, Berlin/Heidelberg. ISSN: 1435-5930; EISSN: 1435-5949. Publication date: 2022-04-01. Pages: 143-169 (27 pages). Rights: The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022. Peer reviewed; free to read. ORCID: https://orcid.org/0000-0002-9850-3751. Full text (HTML): https://link.springer.com/10.1007/s10109-022-00375-9. Full text (PDF): https://link.springer.com/content/pdf/10.1007/s10109-022-00375-9. Title, creators, abstract and subject headings are identical to the title, creator, description and subjects fields of this record.
fulltext fulltext
identifier ISSN: 1435-5930
ispartof Journal of geographical systems, 2022-04, Vol.24 (2), p.143-169
issn 1435-5930
1435-5949
language eng
recordid cdi_proquest_journals_2659825630
source Business Source Complete; SpringerNature Journals
subjects Abbreviations
Algorithms
Analysis
Artificial neural networks
Coders
Computer Appl. in Social and Behavioral Sciences
Deep learning
Digital media
Disaster management
Econometrics
Economics
Economics and Finance
Geographical Information Systems/Cartography
Geospatial data
Information processing
Information retrieval
Landscape/Regional and Urban Planning
Language
Machine learning
Messages
Methods
Natural disasters
Neural networks
Original Article
Recognition
Recurrent neural networks
Regional/Spatial Science
Representations
Social interactions
Social media
Social networks
Urban Economics
title Chinese toponym recognition with variant neural structures from social media messages based on BERT methods
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-01T02%3A36%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Chinese%20toponym%20recognition%20with%20variant%20neural%20structures%20from%20social%20media%20messages%20based%20on%20BERT%20methods&rft.jtitle=Journal%20of%20geographical%20systems&rft.au=Ma,%20Kai&rft.date=2022-04-01&rft.volume=24&rft.issue=2&rft.spage=143&rft.epage=169&rft.pages=143-169&rft.issn=1435-5930&rft.eissn=1435-5949&rft_id=info:doi/10.1007/s10109-022-00375-9&rft_dat=%3Cgale_proqu%3EA702702937%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2659825630&rft_id=info:pmid/&rft_galeid=A702702937&rfr_iscdi=true