Mirroring Vector Space Embedding for New Words
Most embedding models used in natural language processing require retraining of the entire model to obtain the embedding of a new word. As retraining is repeated, the amount of training data gradually grows, so retraining the entire model whenever new words emerge is highly inefficient. Moreover, since a language has a huge number of words and its characteristics change continuously over time, it is not feasible to embed all words in advance. To solve both problems, we propose a new embedding model, the Mirroring Vector Space (MVS), which obtains an embedding for a new word from a previously built word embedding model without retraining it. The MVS model has a convolutional neural network (CNN) structure and presents a novel strategy for obtaining word embeddings: it predicts the embedding of a word by learning the vector space of an existing embedding model from the word's explanations. It also provides flexibility with respect to external resources, reusability of previous training, and portability, in that it can be used with any embedding model. We verify these three attributes and the novelty of the approach in our experiments.
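The abstract carries the core idea without code: a CNN learns to map a word's explanation (e.g., a dictionary definition) into the vector space of an existing embedding model, so a new word's embedding can be predicted from its definition alone. Below is a minimal sketch of that idea, assuming PyTorch; the class name `DefinitionToEmbeddingCNN`, the layer sizes, the kernel widths, and the cosine loss are illustrative assumptions, not the paper's reported architecture or training objective.

```python
# Minimal sketch of the MVS idea (our reading of the abstract), assuming PyTorch.
# All hyperparameters and names here are hypothetical, not from the paper.
import torch
import torch.nn as nn

class DefinitionToEmbeddingCNN(nn.Module):
    """Predicts a word's embedding from the embeddings of its definition tokens."""
    def __init__(self, emb_dim=300, num_filters=128, kernel_sizes=(2, 3, 4)):
        super().__init__()
        # One 1-D convolution per kernel size, applied over the definition sequence.
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, num_filters, k) for k in kernel_sizes]
        )
        # Project the pooled features back into the existing embedding space.
        self.proj = nn.Linear(num_filters * len(kernel_sizes), emb_dim)

    def forward(self, definition_embs):
        # definition_embs: (batch, seq_len, emb_dim) -> (batch, emb_dim, seq_len)
        x = definition_embs.transpose(1, 2)
        # Max-pool each convolution's output over the sequence dimension.
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.proj(torch.cat(pooled, dim=1))

# Training pairs known words' definition embeddings with their existing vectors.
model = DefinitionToEmbeddingCNN()
defs = torch.randn(8, 20, 300)    # 8 definitions, 20 tokens, pretrained 300-d embeddings
targets = torch.randn(8, 300)     # existing embeddings of the 8 headwords
loss = 1 - nn.functional.cosine_similarity(model(defs), targets).mean()
loss.backward()
```

At inference time, the definition of an out-of-vocabulary word is fed through the same network to obtain its vector, so the base embedding model never needs retraining; this matches the reusability and portability claims in the abstract, since any pretrained embedding model could supply the input and target vectors.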
Saved in:
Published in: | IEEE Access 2021, Vol.9, p.99954-99967 |
---|---|
Main authors: | Kim, Jihye; Jeong, Ok-Ran |
Format: | Article |
Language: | English |
Subjects: | Artificial intelligence; Artificial neural networks; Bit error rate; Data models; deep learning; Dictionaries; Embedding; Hidden Markov models; Learning; Natural language processing; neural networks; new words; Semantics; Task analysis; Training; Vector space; word embedding |
Online access: | Full text |
container_end_page | 99967 |
---|---|
container_issue | |
container_start_page | 99954 |
container_title | IEEE access |
container_volume | 9 |
creator | Kim, Jihye; Jeong, Ok-Ran |
description | Most embedding models used in natural language processing require retraining of the entire model to obtain the embedding of a new word. As retraining is repeated, the amount of training data gradually grows, so retraining the entire model whenever new words emerge is highly inefficient. Moreover, since a language has a huge number of words and its characteristics change continuously over time, it is not feasible to embed all words in advance. To solve both problems, we propose a new embedding model, the Mirroring Vector Space (MVS), which obtains an embedding for a new word from a previously built word embedding model without retraining it. The MVS model has a convolutional neural network (CNN) structure and presents a novel strategy for obtaining word embeddings: it predicts the embedding of a word by learning the vector space of an existing embedding model from the word's explanations. It also provides flexibility with respect to external resources, reusability of previous training, and portability, in that it can be used with any embedding model. We verify these three attributes and the novelty of the approach in our experiments. |
doi_str_mv | 10.1109/ACCESS.2021.3096238 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2021, Vol.9, p.99954-99967 |
issn | 2169-3536 (ISSN); 2169-3536 (EISSN) |
language | eng |
recordid | cdi_proquest_journals_2553593405 |
source | IEEE Open Access Journals; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Artificial intelligence; Artificial neural networks; Bit error rate; Data models; deep learning; Dictionaries; Embedding; Hidden Markov models; Learning; Natural language processing; neural networks; new words; Semantics; Task analysis; Training; Vector space; word embedding |
title | Mirroring Vector Space Embedding for New Words |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T01%3A31%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Mirroring%20Vector%20Space%20Embedding%20for%20New%20Words&rft.jtitle=IEEE%20access&rft.au=Kim,%20Jihye&rft.date=2021&rft.volume=9&rft.spage=99954&rft.epage=99967&rft.pages=99954-99967&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2021.3096238&rft_dat=%3Cproquest_doaj_%3E2553593405%3C/proquest_doaj_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2553593405&rft_id=info:pmid/&rft_ieee_id=9481109&rft_doaj_id=oai_doaj_org_article_5b48ddd5f8d14c7b9cb75303e2240b56&rfr_iscdi=true |