Inferring joins for data sets

Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for inferring joins for data sets. In some implementations, a first data table and a second data table are identified. A first subset of records is selected from the first data table and a second subset...

Detailed description

Bibliographic details
Main authors: Yu, Nannan; Pineda, Mohamed Diakite; Huang, Ren-Jay
Format: Patent
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Yu, Nannan
Pineda, Mohamed Diakite
Huang, Ren-Jay
description Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for inferring joins for data sets. In some implementations, a first data table and a second data table are identified. A first subset of records is selected from the first data table and a second subset of records is selected from the second data table. For fields of the first subset and the second subset, sets of feature values are generated indicating characteristics of the data in the fields. Based on the sets of feature values, one or more similarity scores are determined, with each similarity score indicating a similarity of a column in the first data table with respect to a column in the second data table. Based on the one or more similarity scores, data indicating a recommendation to join one or more columns of the first data table with one or more columns of the second data table is provided for output by a computing device.
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_US11604797B2</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>US11604797B2</sourcerecordid><originalsourceid>FETCH-epo_espacenet_US11604797B23</originalsourceid><addsrcrecordid>eNrjZJD1zEtLLSrKzEtXyMrPzCtWSMsvUkhJLElUKE4tKeZhYE1LzClO5YXS3AyKbq4hzh66qQX58anFBYnJqXmpJfGhwYaGZgYm5pbmTkbGxKgBAL_fJCU</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Inferring joins for data sets</title><source>esp@cenet</source><creator>Yu, Nannan ; Pineda, Mohamed Diakite ; Huang, Ren-Jay</creator><creatorcontrib>Yu, Nannan ; Pineda, Mohamed Diakite ; Huang, Ren-Jay</creatorcontrib><description>Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for inferring joins for data sets. In some implementations, a first data table and a second data table are identified. A first subset of records are selected from the first data table and a second subset of records are selected from the second data table. For fields of the first subset and the second subset, sets of feature values are generated indicating characteristics of the data in the fields. Based on the sets of feature values, one or more similarity score are determined, with each similarity score indicating a similarity of a column in the first data table with respect to a column in the second data table. 
Based on the one or more similarity scores, data indicating a recommendation to join one or more columns of the first data table with one or more columns of the second data table is provided for output by a computing device.</description><language>eng</language><subject>CALCULATING ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2023</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20230314&amp;DB=EPODOC&amp;CC=US&amp;NR=11604797B2$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25564,76419</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20230314&amp;DB=EPODOC&amp;CC=US&amp;NR=11604797B2$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Yu, Nannan</creatorcontrib><creatorcontrib>Pineda, Mohamed Diakite</creatorcontrib><creatorcontrib>Huang, Ren-Jay</creatorcontrib><title>Inferring joins for data sets</title><description>Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for inferring joins for data sets. In some implementations, a first data table and a second data table are identified. A first subset of records are selected from the first data table and a second subset of records are selected from the second data table. For fields of the first subset and the second subset, sets of feature values are generated indicating characteristics of the data in the fields. 
Based on the sets of feature values, one or more similarity score are determined, with each similarity score indicating a similarity of a column in the first data table with respect to a column in the second data table. Based on the one or more similarity scores, data indicating a recommendation to join one or more columns of the first data table with one or more columns of the second data table is provided for output by a computing device.</description><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2023</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZJD1zEtLLSrKzEtXyMrPzCtWSMsvUkhJLElUKE4tKeZhYE1LzClO5YXS3AyKbq4hzh66qQX58anFBYnJqXmpJfGhwYaGZgYm5pbmTkbGxKgBAL_fJCU</recordid><startdate>20230314</startdate><enddate>20230314</enddate><creator>Yu, Nannan</creator><creator>Pineda, Mohamed Diakite</creator><creator>Huang, Ren-Jay</creator><scope>EVB</scope></search><sort><creationdate>20230314</creationdate><title>Inferring joins for data sets</title><author>Yu, Nannan ; Pineda, Mohamed Diakite ; Huang, Ren-Jay</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_US11604797B23</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2023</creationdate><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>Yu, Nannan</creatorcontrib><creatorcontrib>Pineda, Mohamed Diakite</creatorcontrib><creatorcontrib>Huang, Ren-Jay</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yu, Nannan</au><au>Pineda, Mohamed 
Diakite</au><au>Huang, Ren-Jay</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Inferring joins for data sets</title><date>2023-03-14</date><risdate>2023</risdate><abstract>Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for inferring joins for data sets. In some implementations, a first data table and a second data table are identified. A first subset of records are selected from the first data table and a second subset of records are selected from the second data table. For fields of the first subset and the second subset, sets of feature values are generated indicating characteristics of the data in the fields. Based on the sets of feature values, one or more similarity score are determined, with each similarity score indicating a similarity of a column in the first data table with respect to a column in the second data table. Based on the one or more similarity scores, data indicating a recommendation to join one or more columns of the first data table with one or more columns of the second data table is provided for output by a computing device.</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier
ispartof
issn
language eng
recordid cdi_epo_espacenet_US11604797B2
source esp@cenet
subjects CALCULATING
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
PHYSICS
title Inferring joins for data sets
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T23%3A10%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Yu,%20Nannan&rft.date=2023-03-14&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS11604797B2%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
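
The pipeline summarized in the description field (sample records from two tables, generate feature values per field, score column pairs, recommend joins) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method claimed in the patent: the record does not say which features or which similarity measure are used, so the four features, the mean-absolute-difference score, the 0.9 threshold, and all function and table names below are hypothetical.

```python
import random
from itertools import product


def sample_records(table, k, seed=0):
    """Select a subset of at most k records from a table (a list of dicts)."""
    if len(table) <= k:
        return list(table)
    return random.Random(seed).sample(table, k)


def column_features(records, column):
    """Feature values for one field. The choice of features is an assumption."""
    values = [str(r[column]) for r in records if r.get(column) is not None]
    if not values:
        return (0.0, 0.0, 0.0, 0.0)
    n = len(values)
    digit_frac = sum(v.isdigit() for v in values) / n   # share of all-digit values
    alpha_frac = sum(v.isalpha() for v in values) / n   # share of all-letter values
    distinct_ratio = len(set(values)) / n               # uniqueness of the field
    mean_len = min(sum(len(v) for v in values) / n, 32) / 32  # capped, scaled to [0, 1]
    return (digit_frac, alpha_frac, distinct_ratio, mean_len)


def similarity(f1, f2):
    """Assumed score: 1 minus the mean absolute feature difference, in [0, 1]."""
    return 1.0 - sum(abs(a - b) for a, b in zip(f1, f2)) / len(f1)


def recommend_joins(table_a, table_b, k=100, threshold=0.9):
    """Return (score, column_a, column_b) tuples at or above the threshold.

    Assumes both tables are non-empty and that the first record of each
    table carries every column name.
    """
    sub_a, sub_b = sample_records(table_a, k), sample_records(table_b, k)
    cols_a, cols_b = list(table_a[0]), list(table_b[0])
    feats_a = {c: column_features(sub_a, c) for c in cols_a}
    feats_b = {c: column_features(sub_b, c) for c in cols_b}
    scored = [
        (similarity(feats_a[a], feats_b[b]), a, b)
        for a, b in product(cols_a, cols_b)
    ]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


# Hypothetical example tables, invented for illustration only.
orders = [
    {"customer_id": 101, "amount": "19.99"},
    {"customer_id": 102, "amount": "5.00"},
    {"customer_id": 103, "amount": "42.50"},
    {"customer_id": 101, "amount": "7.25"},
]
customers = [
    {"id": 101, "name": "Ada"},
    {"id": 102, "name": "Grace"},
    {"id": 103, "name": "Linus"},
]

for score, col_a, col_b in recommend_joins(orders, customers):
    print(f"recommend joining orders.{col_a} with customers.{col_b} (score {score:.4f})")
```

Scoring on sampled subsets keeps the cost bounded by the sample size rather than by the full table size. Because the score compares only the shape of the data, a practical system would probably also check actual value overlap before recommending a join; that step is left out of this sketch.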