INFERRING JOINS FOR DATA SETS
Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for inferring joins for data sets. In some implementations, a first data table and a second data table are identified. A first subset of records are selected from the first data table and a second subset...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for inferring joins for data sets. In some implementations, a first data table and a second data table are identified. A first subset of records are selected from the first data table and a second subset of records are selected from the second data table. For fields of the first subset and the second subset, sets of feature values are generated indicating characteristics of the data in the fields. Based on the sets of feature values, one or more similarity score are determined, with each similarity score indicating a similarity of a column in the first data table with respect to a column in the second data table. Based on the one or more similarity scores, data indicating a recommendation to join one or more columns of the first data table with one or more columns of the second data table is provided for output by a computing device. |
---|