Mirroring Vector Space Embedding for New Words

Most embedding models used in natural language processing require retraining of the entire model to obtain the embedding of a new word. Because the amount of training data grows with every retraining cycle, retraining the entire model whenever new words emerge is very inefficient. Moreover, since a language contains a huge number of words and its characteristics change continuously over time, it is not feasible to embed every word in advance. To solve both problems, we propose a new embedding model, the Mirroring Vector Space (MVS), which obtains the embedding of a new word from a previously built word embedding model without retraining it. The MVS model uses a convolutional neural network (CNN) structure and presents a novel strategy for obtaining word embeddings: it predicts a word's embedding by learning the vector space of an existing embedding model from explanations of the word. The model also provides flexibility toward external resources, reusability of training effort, and portability, in that it can be combined with any embedding model. We verify these three attributes, and the novelty of the approach, in our experiments.
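The abstract describes the mechanism only at a high level, so the following is a minimal PyTorch sketch of the stated idea: a small CNN reads the frozen, pretrained embeddings of the words in a new word's explanation (e.g., a dictionary definition) and regresses a vector in the existing embedding space, so the base model is never retrained. Every concrete choice below (layer sizes, kernel widths, max-pooling, the cosine loss) is an assumption for illustration, not the paper's actual MVS architecture.

import torch
import torch.nn as nn

class MVSSketch(nn.Module):
    """Hypothetical stand-in for MVS: maps the pretrained embeddings of a
    definition's words to one vector in the same embedding space."""

    def __init__(self, emb_dim=300, n_filters=128, kernel_sizes=(2, 3, 4)):
        super().__init__()
        # One 1-D convolution per kernel width over the definition sequence.
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, n_filters, k) for k in kernel_sizes
        )
        # Project the pooled features back into the base embedding space.
        self.proj = nn.Linear(n_filters * len(kernel_sizes), emb_dim)

    def forward(self, definition_vecs):
        # definition_vecs: (batch, def_len, emb_dim), looked up in the
        # frozen base model for each word of the explanation.
        x = definition_vecs.transpose(1, 2)         # (batch, emb_dim, def_len)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.proj(torch.cat(pooled, dim=1))  # (batch, emb_dim)

# Train on known words: input = vectors of the words in the definition,
# target = the word's existing embedding in the base model.
model = MVSSketch()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
defs = torch.randn(32, 12, 300)   # placeholder for real definition lookups
targets = torch.randn(32, 300)    # placeholder for base-model embeddings
loss = 1 - nn.functional.cosine_similarity(model(defs), targets).mean()
loss.backward()
optimizer.step()

At inference time, the same forward pass places a genuinely new word into the frozen vector space, which is what allows the approach to skip full retraining.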

Bibliographic Details

Published in: IEEE Access, 2021, Vol. 9, pp. 99954-99967
Authors: Kim, Jihye; Jeong, Ok-Ran
Format: Article
Language: English
DOI: 10.1109/ACCESS.2021.3096238
ISSN/EISSN: 2169-3536
Publisher: IEEE, Piscataway
Subjects: Artificial intelligence; Artificial neural networks; Bit error rate; Data models; Deep learning; Dictionaries; Embedding; Hidden Markov models; Learning; Natural language processing; Neural networks; New words; Semantics; Task analysis; Training; Vector space; Word embedding
Online access: Full text