Implicit relation induction via purposeful overfitting of a word embedding model on a subset of a document corpus

A method overfits a word vector generating process to identify implicit relationships between two or more terms in a corpus. A server identifies instances of multiple user-generated pairs of terms in an original corpus of documents, in which the terms are labeled but a relationship between two or mo...

Full description

Bibliographic details
Main authors:	Stoyanovsky, Anastas; Yates, Robert L; Gheorghiu, Roxana
Format:	Patent
Language:	eng
Subjects:	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Stoyanovsky, Anastas; Yates, Robert L; Gheorghiu, Roxana
description	A method overfits a word-vector-generating process to identify implicit relationships between two or more terms in a corpus. A server identifies instances of multiple user-generated pairs of terms in an original corpus of documents, in which the terms are labeled but a relationship between two or more of the corpus terms is not identified. The server then extracts, from the original corpus of documents, sentences that contain one or more of the user-generated pairs of terms, and combines those sentences into a training corpus, which is used to purposely overfit a word embedding model. The overfitted model yields a vector that is used to identify other terms that have the same type of relationship as that found in the user-generated pairs, so that a search corpus of documents can be searched for terms similar to those that trained the word embedding model.
format	Patent
fullrecord	<record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_US10885082B2</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>US10885082B2</sourcerecordid><originalsourceid>FETCH-epo_espacenet_US10885082B23</originalsourceid><addsrcrecordid>eNqNyk0KwjAQBeBuXIh6h_EAQq0IXSuKrtV1SZOJDCSZmEzq9a0_B3D1Hu990-px9tGRJoGETglxAAqm6E8bSEEsKXJGWxzwgMmSCIU7sAUFT04G0PdozHvzbHBUYXxy6TPKVxnWxWMQ0JxiyfNqYpXLuPjlrFoeD9f9aYWRO8xRaQwo3e2yrtt2W7fNrtn8Y17UeEP_</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Implicit relation induction via purposeful overfitting of a word embedding model on a subset of a document corpus</title><source>esp@cenet</source><creator>Stoyanovsky, Anastas ; Yates, Robert L ; Gheorghiu, Roxana</creator><creatorcontrib>Stoyanovsky, Anastas ; Yates, Robert L ; Gheorghiu, Roxana</creatorcontrib><description>A method overfits a word vector generating process to identify implicit relationships between two or more terms in a corpus. A server identifies instances of multiple user-generated pairs of terms in an original corpus of documents, in which the terms are labeled but a relationship between two or more of the corpus terms are not identified. The server then extracts sentences, from the original corpus of documents, that contain one or more of the multiple user-generated pairs of terms, and combines the sentences into a training corpus, which is used to purposely overfit a word embedding model. 
This word embedding model leads to a vector that is used to identify other terms that have a same type of relationship as that found in the multiple user-generated pairs of terms, such that search corpus of documents can be searched for similar terms that trained the word embedding model.</description><language>eng</language><subject>CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2021</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210105&DB=EPODOC&CC=US&NR=10885082B2$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,778,883,25547,76298</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20210105&DB=EPODOC&CC=US&NR=10885082B2$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Stoyanovsky, Anastas</creatorcontrib><creatorcontrib>Yates, Robert L</creatorcontrib><creatorcontrib>Gheorghiu, Roxana</creatorcontrib><title>Implicit relation induction via purposeful overfitting of a word embedding model on a subset of a document corpus</title><description>A method overfits a word vector generating process to identify implicit relationships between two or more terms in a corpus. A server identifies instances of multiple user-generated pairs of terms in an original corpus of documents, in which the terms are labeled but a relationship between two or more of the corpus terms are not identified. 
The server then extracts sentences, from the original corpus of documents, that contain one or more of the multiple user-generated pairs of terms, and combines the sentences into a training corpus, which is used to purposely overfit a word embedding model. This word embedding model leads to a vector that is used to identify other terms that have a same type of relationship as that found in the multiple user-generated pairs of terms, such that search corpus of documents can be searched for similar terms that trained the word embedding model.</description><subject>CALCULATING</subject><subject>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2021</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNqNyk0KwjAQBeBuXIh6h_EAQq0IXSuKrtV1SZOJDCSZmEzq9a0_B3D1Hu990-px9tGRJoGETglxAAqm6E8bSEEsKXJGWxzwgMmSCIU7sAUFT04G0PdozHvzbHBUYXxy6TPKVxnWxWMQ0JxiyfNqYpXLuPjlrFoeD9f9aYWRO8xRaQwo3e2yrtt2W7fNrtn8Y17UeEP_</recordid><startdate>20210105</startdate><enddate>20210105</enddate><creator>Stoyanovsky, Anastas</creator><creator>Yates, Robert L</creator><creator>Gheorghiu, Roxana</creator><scope>EVB</scope></search><sort><creationdate>20210105</creationdate><title>Implicit relation induction via purposeful overfitting of a word embedding model on a subset of a document corpus</title><author>Stoyanovsky, Anastas ; Yates, Robert L ; Gheorghiu, Roxana</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_US10885082B23</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2021</creationdate><topic>CALCULATING</topic><topic>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA 
PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>Stoyanovsky, Anastas</creatorcontrib><creatorcontrib>Yates, Robert L</creatorcontrib><creatorcontrib>Gheorghiu, Roxana</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Stoyanovsky, Anastas</au><au>Yates, Robert L</au><au>Gheorghiu, Roxana</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Implicit relation induction via purposeful overfitting of a word embedding model on a subset of a document corpus</title><date>2021-01-05</date><risdate>2021</risdate><abstract>A method overfits a word vector generating process to identify implicit relationships between two or more terms in a corpus. A server identifies instances of multiple user-generated pairs of terms in an original corpus of documents, in which the terms are labeled but a relationship between two or more of the corpus terms are not identified. The server then extracts sentences, from the original corpus of documents, that contain one or more of the multiple user-generated pairs of terms, and combines the sentences into a training corpus, which is used to purposely overfit a word embedding model. This word embedding model leads to a vector that is used to identify other terms that have a same type of relationship as that found in the multiple user-generated pairs of terms, such that search corpus of documents can be searched for similar terms that trained the word embedding model.</abstract><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier
ispartof
issn
language	eng
recordid	cdi_epo_espacenet_US10885082B2
source	esp@cenet
subjects	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS
title	Implicit relation induction via purposeful overfitting of a word embedding model on a subset of a document corpus
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T12%3A55%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Stoyanovsky,%20Anastas&rft.date=2021-01-05&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS10885082B2%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
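
The pipeline summarized in the description field can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function names (`split_sentences`, `build_training_corpus`, `relation_vector`, `cosine`), the single-token treatment of terms, the naive sentence splitter, and the idea of averaging pair offsets into one relation vector are all assumptions made for the example. The step that actually overfits a word embedding model on the extracted sentences is not shown; the 2-D vectors below are made-up stand-ins for its output.

```python
import math
import re
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]
Vector = Sequence[float]


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption: '.', '!' or '?' followed by space ends a sentence)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_training_corpus(documents: List[str], pairs: List[Pair]) -> List[List[str]]:
    """Keep only sentences that mention both terms of at least one seed pair."""
    corpus: List[List[str]] = []
    for doc in documents:
        for sentence in split_sentences(doc):
            tokens = re.findall(r"[a-z0-9]+", sentence.lower())
            vocab = set(tokens)
            if any(a in vocab and b in vocab for a, b in pairs):
                corpus.append(tokens)
    return corpus


def relation_vector(embeddings: Dict[str, Vector], pairs: List[Pair]) -> List[float]:
    """Average the offsets (second term minus first term) over the seed pairs.

    Assumption: the abstract only says the model "leads to a vector";
    averaging offsets is one plausible way to obtain such a vector.
    """
    usable = [(a, b) for a, b in pairs if a in embeddings and b in embeddings]
    if not usable:
        raise ValueError("no seed pair is covered by the embeddings")
    dim = len(embeddings[usable[0][0]])
    total = [0.0] * dim
    for a, b in usable:
        for i in range(dim):
            total[i] += embeddings[b][i] - embeddings[a][i]
    return [x / len(usable) for x in total]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Toy inputs (hypothetical, for illustration only).
documents = [
    "Aspirin treats headache. The weather was mild.",
    "Doctors say insulin treats diabetes! Insulin is a hormone.",
]
seed_pairs = [("aspirin", "headache"), ("insulin", "diabetes")]
training_corpus = build_training_corpus(documents, seed_pairs)

# Made-up 2-D vectors standing in for the overfitted model's output.
toy_embeddings = {
    "aspirin": [1.0, 0.0], "headache": [1.0, 1.0],
    "insulin": [2.0, 0.0], "diabetes": [2.0, 1.0],
}
relation = relation_vector(toy_embeddings, seed_pairs)
```

A candidate pair from a search corpus could then be scored by comparing its own offset with `relation` using `cosine`; pairs whose offset points in nearly the same direction would be treated as likely instances of the same relationship.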