Chinese toponym recognition with variant neural structures from social media messages based on BERT methods
Many natural language tasks related to geographic information retrieval (GIR) require toponym recognition, and identifying Chinese toponyms from social media messages to share real-time information is a critical problem for many practical applications, such as natural disaster response and geolocati...
Published in: | Journal of geographical systems 2022-04, Vol.24 (2), p.143-169 |
---|---|
Main authors: | Ma, Kai; Tan, YongJian; Xie, Zhong; Qiu, Qinjun; Chen, Siqiong |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 169 |
---|---|
container_issue | 2 |
container_start_page | 143 |
container_title | Journal of geographical systems |
container_volume | 24 |
creator | Ma, Kai Tan, YongJian Xie, Zhong Qiu, Qinjun Chen, Siqiong |
description | Many natural language tasks related to geographic information retrieval (GIR) require toponym recognition, and identifying Chinese toponyms from social media messages to share real-time information is a critical problem for many practical applications, such as natural disaster response and geolocating. In this article, we focused on toponym recognition from social media messages in Chinese. While existing off-the-shelf Chinese named entity recognition (NER) tools could be applied to identify toponyms, these approaches cannot address a variety of language irregularities taken from social media messages, including location name abbreviations, informal sentence structures and combination toponyms. We present a deep neural network named BERT-BiLSTM-CRF, which extends a basic bidirectional recurrent neural network model (BiLSTM) with the pretraining bidirectional encoder representation from transformers (BERT) representation to handle the toponym recognition task in Chinese text. Using three datasets taken from lists of alternative location names, the experimental results showed that the proposed model can significantly outperform previous Chinese NER models/algorithms and a set of state-of-the-art deep learning models. |
doi_str_mv | 10.1007/s10109-022-00375-9 |
format | Article |
fullrecord | <record><control><sourceid>gale_proqu</sourceid><recordid>TN_cdi_proquest_journals_2659825630</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A702702937</galeid><sourcerecordid>A702702937</sourcerecordid><originalsourceid>FETCH-LOGICAL-c492t-b6f98372ced3f5d4a45ca9ae2604db769038ff7eaffc4a70a61c19c634e4e1253</originalsourceid><addsrcrecordid>eNp9UV1rHCEUHUoLTdL-gT4JfZ70-jWOj8mSj0KgUNJncZ3rrumObtRJyL-v7ZTmrSheOfec48XTdZ8onFMA9aVQoKB7YKwH4Er2-k13QgWXvdRCv_135_C-Oy3lAYAqSdVJ93OzDxELkpqOKb7MJKNLuxhqSJE8h7onTzYHGyuJuGR7IKXmxdUlYyE-p5mU5EKDZ5yCbWcpdtdaW1twIs3i8ur7fYPrPk3lQ_fO20PBj3_rWffj-up-c9vffbv5urm4653QrPbbweuRK-Zw4l5OwgrprLbIBhDTVg0a-Oi9Quu9E1aBHaij2g1coEDKJD_rPq--x5weFyzVPKQlx_akYYPUI5MDh8Y6X1k7e0ATok81W9fWhHNwKaIPDb9QwNrWXDUBWwUup1IyenPMYbb5xVAwv1MwawqmpWD-pGB0E5FV1L41hvIqGSm0UegoGoWvlNKacYf5ddz_GP8Ctw2WCA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2659825630</pqid></control><display><type>article</type><title>Chinese toponym recognition with variant neural structures from social media messages based on BERT methods</title><source>Business Source Complete</source><source>SpringerNature Journals</source><creator>Ma, Kai ; Tan, YongJian ; Xie, Zhong ; Qiu, Qinjun ; Chen, Siqiong</creator><creatorcontrib>Ma, Kai ; Tan, YongJian ; Xie, Zhong ; Qiu, Qinjun ; Chen, Siqiong</creatorcontrib><description>Many natural language tasks related to geographic information retrieval (GIR) require toponym recognition, and identifying Chinese toponyms from social media messages to share real-time information is a critical problem for many practical applications, such as natural disaster response and geolocating. In this article, we focused on toponym recognition from social media messages in Chinese. 
While existing off-the-shelf Chinese named entity recognition (NER) tools could be applied to identify toponyms, these approaches cannot address a variety of language irregularities taken from social media messages, including location name abbreviations, informal sentence structures and combination toponyms. We present a deep neural network named BERT-BiLSTM-CRF, which extends a basic bidirectional recurrent neural network model (BiLSTM) with the pretraining bidirectional encoder representation from transformers (BERT) representation to handle the toponym recognition task in Chinese text. Using three datasets taken from lists of alternative location names, the experimental results showed that the proposed model can significantly outperform previous Chinese NER models/algorithms and a set of state-of-the-art deep learning models.</description><identifier>ISSN: 1435-5930</identifier><identifier>EISSN: 1435-5949</identifier><identifier>DOI: 10.1007/s10109-022-00375-9</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Abbreviations ; Algorithms ; Analysis ; Artificial neural networks ; Coders ; Computer Appl. 
in Social and Behavioral Sciences ; Deep learning ; Digital media ; Disaster management ; Econometrics ; Economics ; Economics and Finance ; Geographical Information Systems/Cartography ; Geospatial data ; Information processing ; Information retrieval ; Landscape/Regional and Urban Planning ; Language ; Machine learning ; Messages ; Methods ; Natural disasters ; Neural networks ; Original Article ; Recognition ; Recurrent neural networks ; Regional/Spatial Science ; Representations ; Social interactions ; Social media ; Social networks ; Urban Economics</subject><ispartof>Journal of geographical systems, 2022-04, Vol.24 (2), p.143-169</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022</rights><rights>COPYRIGHT 2022 Springer</rights><rights>The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c492t-b6f98372ced3f5d4a45ca9ae2604db769038ff7eaffc4a70a61c19c634e4e1253</citedby><cites>FETCH-LOGICAL-c492t-b6f98372ced3f5d4a45ca9ae2604db769038ff7eaffc4a70a61c19c634e4e1253</cites><orcidid>0000-0002-9850-3751</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10109-022-00375-9$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10109-022-00375-9$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>315,782,786,27933,27934,41497,42566,51328</link.rule.ids></links><search><creatorcontrib>Ma, Kai</creatorcontrib><creatorcontrib>Tan, YongJian</creatorcontrib><creatorcontrib>Xie, Zhong</creatorcontrib><creatorcontrib>Qiu, Qinjun</creatorcontrib><creatorcontrib>Chen, Siqiong</creatorcontrib><title>Chinese toponym 
recognition with variant neural structures from social media messages based on BERT methods</title><title>Journal of geographical systems</title><addtitle>J Geogr Syst</addtitle><description>Many natural language tasks related to geographic information retrieval (GIR) require toponym recognition, and identifying Chinese toponyms from social media messages to share real-time information is a critical problem for many practical applications, such as natural disaster response and geolocating. In this article, we focused on toponym recognition from social media messages in Chinese. While existing off-the-shelf Chinese named entity recognition (NER) tools could be applied to identify toponyms, these approaches cannot address a variety of language irregularities taken from social media messages, including location name abbreviations, informal sentence structures and combination toponyms. We present a deep neural network named BERT-BiLSTM-CRF, which extends a basic bidirectional recurrent neural network model (BiLSTM) with the pretraining bidirectional encoder representation from transformers (BERT) representation to handle the toponym recognition task in Chinese text. Using three datasets taken from lists of alternative location names, the experimental results showed that the proposed model can significantly outperform previous Chinese NER models/algorithms and a set of state-of-the-art deep learning models.</description><subject>Abbreviations</subject><subject>Algorithms</subject><subject>Analysis</subject><subject>Artificial neural networks</subject><subject>Coders</subject><subject>Computer Appl. 
in Social and Behavioral Sciences</subject><subject>Deep learning</subject><subject>Digital media</subject><subject>Disaster management</subject><subject>Econometrics</subject><subject>Economics</subject><subject>Economics and Finance</subject><subject>Geographical Information Systems/Cartography</subject><subject>Geospatial data</subject><subject>Information processing</subject><subject>Information retrieval</subject><subject>Landscape/Regional and Urban Planning</subject><subject>Language</subject><subject>Machine learning</subject><subject>Messages</subject><subject>Methods</subject><subject>Natural disasters</subject><subject>Neural networks</subject><subject>Original Article</subject><subject>Recognition</subject><subject>Recurrent neural networks</subject><subject>Regional/Spatial Science</subject><subject>Representations</subject><subject>Social interactions</subject><subject>Social media</subject><subject>Social networks</subject><subject>Urban Economics</subject><issn>1435-5930</issn><issn>1435-5949</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNp9UV1rHCEUHUoLTdL-gT4JfZ70-jWOj8mSj0KgUNJncZ3rrumObtRJyL-v7ZTmrSheOfec48XTdZ8onFMA9aVQoKB7YKwH4Er2-k13QgWXvdRCv_135_C-Oy3lAYAqSdVJ93OzDxELkpqOKb7MJKNLuxhqSJE8h7onTzYHGyuJuGR7IKXmxdUlYyE-p5mU5EKDZ5yCbWcpdtdaW1twIs3i8ur7fYPrPk3lQ_fO20PBj3_rWffj-up-c9vffbv5urm4653QrPbbweuRK-Zw4l5OwgrprLbIBhDTVg0a-Oi9Quu9E1aBHaij2g1coEDKJD_rPq--x5weFyzVPKQlx_akYYPUI5MDh8Y6X1k7e0ATok81W9fWhHNwKaIPDb9QwNrWXDUBWwUup1IyenPMYbb5xVAwv1MwawqmpWD-pGB0E5FV1L41hvIqGSm0UegoGoWvlNKacYf5ddz_GP8Ctw2WCA</recordid><startdate>20220401</startdate><enddate>20220401</enddate><creator>Ma, Kai</creator><creator>Tan, 
YongJian</creator><creator>Xie, Zhong</creator><creator>Qiu, Qinjun</creator><creator>Chen, Siqiong</creator><general>Springer Berlin Heidelberg</general><general>Springer</general><general>Springer Nature B.V</general><scope>OQ6</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>87Z</scope><scope>88I</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8FL</scope><scope>8G5</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>BKSAR</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>KR7</scope><scope>L.-</scope><scope>L6V</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M2O</scope><scope>M2P</scope><scope>M7S</scope><scope>MBDVC</scope><scope>PADUT</scope><scope>PATMY</scope><scope>PCBAR</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PTHSS</scope><scope>PYCSY</scope><scope>PYYUZ</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0002-9850-3751</orcidid></search><sort><creationdate>20220401</creationdate><title>Chinese toponym recognition with variant neural structures from social media messages based on BERT methods</title><author>Ma, Kai ; Tan, YongJian ; Xie, Zhong ; Qiu, Qinjun ; Chen, 
Siqiong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c492t-b6f98372ced3f5d4a45ca9ae2604db769038ff7eaffc4a70a61c19c634e4e1253</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Abbreviations</topic><topic>Algorithms</topic><topic>Analysis</topic><topic>Artificial neural networks</topic><topic>Coders</topic><topic>Computer Appl. in Social and Behavioral Sciences</topic><topic>Deep learning</topic><topic>Digital media</topic><topic>Disaster management</topic><topic>Econometrics</topic><topic>Economics</topic><topic>Economics and Finance</topic><topic>Geographical Information Systems/Cartography</topic><topic>Geospatial data</topic><topic>Information processing</topic><topic>Information retrieval</topic><topic>Landscape/Regional and Urban Planning</topic><topic>Language</topic><topic>Machine learning</topic><topic>Messages</topic><topic>Methods</topic><topic>Natural disasters</topic><topic>Neural networks</topic><topic>Original Article</topic><topic>Recognition</topic><topic>Recurrent neural networks</topic><topic>Regional/Spatial Science</topic><topic>Representations</topic><topic>Social interactions</topic><topic>Social media</topic><topic>Social networks</topic><topic>Urban Economics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ma, Kai</creatorcontrib><creatorcontrib>Tan, YongJian</creatorcontrib><creatorcontrib>Xie, Zhong</creatorcontrib><creatorcontrib>Qiu, Qinjun</creatorcontrib><creatorcontrib>Chen, Siqiong</creatorcontrib><collection>ECONIS</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>Access via ABI/INFORM (ProQuest)</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM 
Global (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>Research Library (Alumni Edition)</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Agricultural & Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>Earth, Atmospheric & Aquatic Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Business Premium Collection (Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Civil Engineering Abstracts</collection><collection>ABI/INFORM Professional Advanced</collection><collection>ProQuest Engineering Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts 
Professional</collection><collection>ABI/INFORM Global</collection><collection>Research Library</collection><collection>Science Database</collection><collection>Engineering Database</collection><collection>Research Library (Corporate)</collection><collection>Research Library China</collection><collection>Environmental Science Database</collection><collection>Earth, Atmospheric & Aquatic Science Database</collection><collection>ProQuest One Business</collection><collection>ProQuest One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>Engineering Collection</collection><collection>Environmental Science Collection</collection><collection>ABI/INFORM Collection China</collection><collection>ProQuest Central Basic</collection><jtitle>Journal of geographical systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ma, Kai</au><au>Tan, YongJian</au><au>Xie, Zhong</au><au>Qiu, Qinjun</au><au>Chen, Siqiong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Chinese toponym recognition with variant neural structures from social media messages based on BERT methods</atitle><jtitle>Journal of geographical systems</jtitle><stitle>J Geogr Syst</stitle><date>2022-04-01</date><risdate>2022</risdate><volume>24</volume><issue>2</issue><spage>143</spage><epage>169</epage><pages>143-169</pages><issn>1435-5930</issn><eissn>1435-5949</eissn><abstract>Many natural language tasks related to geographic information retrieval (GIR) require toponym recognition, and identifying Chinese toponyms from social media messages to share real-time information is a critical problem for many practical applications, such as natural disaster response and geolocating. 
In this article, we focused on toponym recognition from social media messages in Chinese. While existing off-the-shelf Chinese named entity recognition (NER) tools could be applied to identify toponyms, these approaches cannot address a variety of language irregularities taken from social media messages, including location name abbreviations, informal sentence structures and combination toponyms. We present a deep neural network named BERT-BiLSTM-CRF, which extends a basic bidirectional recurrent neural network model (BiLSTM) with the pretraining bidirectional encoder representation from transformers (BERT) representation to handle the toponym recognition task in Chinese text. Using three datasets taken from lists of alternative location names, the experimental results showed that the proposed model can significantly outperform previous Chinese NER models/algorithms and a set of state-of-the-art deep learning models.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s10109-022-00375-9</doi><tpages>27</tpages><orcidid>https://orcid.org/0000-0002-9850-3751</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1435-5930 |
ispartof | Journal of geographical systems, 2022-04, Vol.24 (2), p.143-169 |
issn | 1435-5930 1435-5949 |
language | eng |
recordid | cdi_proquest_journals_2659825630 |
source | Business Source Complete; SpringerNature Journals |
subjects | Abbreviations Algorithms Analysis Artificial neural networks Coders Computer Appl. in Social and Behavioral Sciences Deep learning Digital media Disaster management Econometrics Economics Economics and Finance Geographical Information Systems/Cartography Geospatial data Information processing Information retrieval Landscape/Regional and Urban Planning Language Machine learning Messages Methods Natural disasters Neural networks Original Article Recognition Recurrent neural networks Regional/Spatial Science Representations Social interactions Social media Social networks Urban Economics |
title | Chinese toponym recognition with variant neural structures from social media messages based on BERT methods |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-01T02%3A36%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Chinese%20toponym%20recognition%20with%20variant%20neural%20structures%20from%20social%20media%20messages%20based%20on%20BERT%20methods&rft.jtitle=Journal%20of%20geographical%20systems&rft.au=Ma,%20Kai&rft.date=2022-04-01&rft.volume=24&rft.issue=2&rft.spage=143&rft.epage=169&rft.pages=143-169&rft.issn=1435-5930&rft.eissn=1435-5949&rft_id=info:doi/10.1007/s10109-022-00375-9&rft_dat=%3Cgale_proqu%3EA702702937%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2659825630&rft_id=info:pmid/&rft_galeid=A702702937&rfr_iscdi=true |