Linking data elements based on similarity data values and semantic annotations

Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of...

Detailed description

Bibliographic details
Main authors: Srinivas, Kavitha, Kementsietsidis, Anastasios, Ward, Michael James, Hassanzadeh, Oktie, Duan, Songyun, Fokoue-Nkoutche, Achille Belly, Bornea, Mihaela Ancuta
Format: Patent
Language: eng
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Srinivas, Kavitha
Kementsietsidis, Anastasios
Ward, Michael James
Hassanzadeh, Oktie
Duan, Songyun
Fokoue-Nkoutche, Achille Belly
Bornea, Mihaela Ancuta
description Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements are maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked.
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_US10229200B2</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>US10229200B2</sourcerecordid><originalsourceid>FETCH-epo_espacenet_US10229200B23</originalsourceid><addsrcrecordid>eNqNykEOAUEQRuHZWAjuUA4gaW1lS4iF2GA9KTO_SUV39UQViduTcACrl5d8w-qwF72JdtSyMyEhQ93owoaWipJJlsR38ddXPDk9YMTakiGzujSf0eLsUtTG1eDKyTD5dVRNt5vTejdDX2pYzw0UXp-P8xDjMoawiot_zBsjJzbO</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Linking data elements based on similarity data values and semantic annotations</title><source>esp@cenet</source><creator>Srinivas, Kavitha ; Kementsietsidis, Anastasios ; Ward, Michael James ; Hassanzadeh, Oktie ; Duan, Songyun ; Fokoue-Nkoutche, Achille Belly ; Bornea, Mihaela Ancuta</creator><creatorcontrib>Srinivas, Kavitha ; Kementsietsidis, Anastasios ; Ward, Michael James ; Hassanzadeh, Oktie ; Duan, Songyun ; Fokoue-Nkoutche, Achille Belly ; Bornea, Mihaela Ancuta</creatorcontrib><description>Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements is maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. 
Candidate pairs of data elements having a similarity index above a given threshold are linked.</description><language>eng</language><creationdate>2019</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20190312&amp;DB=EPODOC&amp;CC=US&amp;NR=10229200B2$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25564,76547</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20190312&amp;DB=EPODOC&amp;CC=US&amp;NR=10229200B2$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Srinivas, Kavitha</creatorcontrib><creatorcontrib>Kementsietsidis, Anastasios</creatorcontrib><creatorcontrib>Ward, Michael James</creatorcontrib><creatorcontrib>Hassanzadeh, Oktie</creatorcontrib><creatorcontrib>Duan, Songyun</creatorcontrib><creatorcontrib>Fokoue-Nkoutche, Achille Belly</creatorcontrib><creatorcontrib>Bornea, Mihaela Ancuta</creatorcontrib><title>Linking data elements based on similarity data values and semantic annotations</title><description>Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements is maintained among the plurality of instance signatures. 
Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked.</description><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2019</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNqNykEOAUEQRuHZWAjuUA4gaW1lS4iF2GA9KTO_SUV39UQViduTcACrl5d8w-qwF72JdtSyMyEhQ93owoaWipJJlsR38ddXPDk9YMTakiGzujSf0eLsUtTG1eDKyTD5dVRNt5vTejdDX2pYzw0UXp-P8xDjMoawiot_zBsjJzbO</recordid><startdate>20190312</startdate><enddate>20190312</enddate><creator>Srinivas, Kavitha</creator><creator>Kementsietsidis, Anastasios</creator><creator>Ward, Michael James</creator><creator>Hassanzadeh, Oktie</creator><creator>Duan, Songyun</creator><creator>Fokoue-Nkoutche, Achille Belly</creator><creator>Bornea, Mihaela Ancuta</creator><scope>EVB</scope></search><sort><creationdate>20190312</creationdate><title>Linking data elements based on similarity data values and semantic annotations</title><author>Srinivas, Kavitha ; Kementsietsidis, Anastasios ; Ward, Michael James ; Hassanzadeh, Oktie ; Duan, Songyun ; Fokoue-Nkoutche, Achille Belly ; Bornea, Mihaela Ancuta</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_US10229200B23</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2019</creationdate><toplevel>online_resources</toplevel><creatorcontrib>Srinivas, Kavitha</creatorcontrib><creatorcontrib>Kementsietsidis, Anastasios</creatorcontrib><creatorcontrib>Ward, Michael James</creatorcontrib><creatorcontrib>Hassanzadeh, Oktie</creatorcontrib><creatorcontrib>Duan, Songyun</creatorcontrib><creatorcontrib>Fokoue-Nkoutche, Achille Belly</creatorcontrib><creatorcontrib>Bornea, Mihaela 
Ancuta</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Srinivas, Kavitha</au><au>Kementsietsidis, Anastasios</au><au>Ward, Michael James</au><au>Hassanzadeh, Oktie</au><au>Duan, Songyun</au><au>Fokoue-Nkoutche, Achille Belly</au><au>Bornea, Mihaela Ancuta</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Linking data elements based on similarity data values and semantic annotations</title><date>2019-03-12</date><risdate>2019</risdate><abstract>Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements is maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked.</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier
ispartof
issn
language eng
recordid cdi_epo_espacenet_US10229200B2
source esp@cenet
title Linking data elements based on similarity data values and semantic annotations
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T22%3A30%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Srinivas,%20Kavitha&rft.date=2019-03-12&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS10229200B2%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
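
The pipeline in the description field (fixed-size signatures, candidate pairs via locality sensitive hashing, a similarity index, a threshold) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes MinHash for the signatures, banding for the locality sensitive hashing step, and Jaccard similarity as the "pre-determined measure of similarity"; the record names none of these. All function names, parameters (num_hashes, bands, threshold) and sample data are hypothetical, and the semantic annotations mentioned in the title are not modeled.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def _hash(seed: int, value: str) -> int:
    """Deterministic 64-bit hash of a value under a given seed."""
    digest = hashlib.sha1(f"{seed}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(values, num_hashes=64):
    """Fixed-size signature: the minimum hash of the value set per seed.

    Assumption: MinHash stands in for the unspecified hash functions.
    """
    return tuple(min(_hash(seed, v) for v in values) for seed in range(num_hashes))


def candidate_pairs(signatures, bands=16):
    """LSH banding: elements sharing any identical band become candidates."""
    if not signatures:
        return set()
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for name, sig in signatures.items():
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(name)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


def jaccard(a, b):
    """Similarity index (assumed measure): |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link(elements, threshold=0.5, num_hashes=64, bands=16):
    """Return (element, element, score) for candidates scoring above threshold."""
    # Elements with an empty value set have no signature and are skipped.
    sigs = {
        name: minhash_signature(vals, num_hashes)
        for name, vals in elements.items()
        if vals
    }
    links = []
    for a, b in sorted(candidate_pairs(sigs, bands)):
        score = jaccard(elements[a], elements[b])
        if score > threshold:
            links.append((a, b, score))
    return links


if __name__ == "__main__":
    # Hypothetical data elements (e.g. columns) with their value sets.
    sample = {
        "a.city": {"Berlin", "Paris", "Rome"},
        "b.town": {"Rome", "Paris", "Berlin"},
        "c.price": {"1.99", "2.50"},
    }
    print(link(sample))
```

Only pairs that share at least one band are scored, so the exact similarity is not computed for every possible pair. Raising bands (for a fixed num_hashes) admits more candidates; lowering it admits fewer.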