A Lexical Resource-Constrained Topic Model for Word Relatedness

Word relatedness computation is an important supporting technology for many tasks in natural language processing. Traditionally, there have been two distinct strategies for word relatedness measurement: one utilizes corpus-based models, whereas the other leverages external lexical resources. However...

Detailed description

Saved in:
Bibliographic details
Published in: IEEE Access, 2019, Vol. 7, p. 55261-55268
Main authors: Yin, Yongjing; Zeng, Jiali; Wang, Hongji; Wu, Keqing; Luo, Bin; Su, Jinsong
Format: Article
Language: English
Online access: Full text
Description: Word relatedness computation is an important supporting technology for many tasks in natural language processing. Traditionally, there have been two distinct strategies for word relatedness measurement: one utilizes corpus-based models, whereas the other leverages external lexical resources. However, each solution has its own strengths and weaknesses. In this paper, we propose a lexical resource-constrained topic model to integrate the two complementary strategies effectively. Our model is an extension of probabilistic latent semantic analysis, which automatically learns word-level distributed representations for word relatedness measurement. Furthermore, we introduce the generalized expectation maximization (GEM) algorithm for statistical estimation. The proposed model not only inherits the advantage of conventional topic models in dimension reduction, but also refines parameter estimation by using word pairs that are known to be related. Experimental results in different languages demonstrate the effectiveness of our model in topic extraction and word relatedness measurement.
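The paper's exact model is not reproduced in this record, but the core idea it builds on — fitting a PLSA-style topic model with EM, representing each word by its topic distribution, and scoring relatedness by comparing those distributions — can be sketched as follows. This is a minimal illustration of plain PLSA with cosine similarity over topic vectors; the lexical-resource constraint and the GEM refinement described in the abstract are *not* implemented here, and all function names and the toy corpus are illustrative.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit plain PLSA by EM on a (docs x vocab) count matrix.

    Returns P(z|d) with shape (docs, topics) and P(w|z) with shape
    (topics, vocab). This is standard PLSA, not the paper's constrained model.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities P(z|d,w) ~ P(z|d) * P(w|z), shape (d, z, w)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / joint.sum(axis=1, keepdims=True).clip(1e-12)
        # M-step: reweight responsibilities by the observed counts n(d, w)
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_z_d, p_w_z

def topic_vector(p_w_z, w):
    """P(z|w) up to normalization: invert P(w|z) with a uniform topic prior."""
    v = p_w_z[:, w]
    return v / v.sum()

def relatedness(p_w_z, i, j):
    """Word relatedness as cosine similarity of the two topic distributions."""
    a, b = topic_vector(p_w_z, i), topic_vector(p_w_z, j)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

On a toy corpus whose first two documents use only words 0 and 1 and whose last two use only words 2 and 3, the fitted topic vectors make `relatedness(p_w_z, 0, 1)` high and `relatedness(p_w_z, 0, 2)` low — the dimension-reduction property of topic models that the abstract refers to. The paper's contribution is to additionally constrain this estimation with known-related word pairs from a lexical resource.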
DOI: 10.1109/ACCESS.2019.2909104
ISSN: 2169-3536
Source: IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
Subjects: Algorithms; Analytical models; Computational modeling; Linear programming; Natural language processing; Parameter estimation; Semantics; Statistical analysis; Task analysis; Training; Unsupervised learning; Words (language)