Schema matching based on SQL statements
Published in: | Distributed and parallel databases : an international journal 2020-03, Vol.38 (1), p.193-226 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Schema matching is a critical step in numerous database applications, such as integrating web data sources, loading data warehouses, and exchanging information among several authorities. In this paper, we propose to exploit the similarities of the SQL statements in the query logs to find the correspondences between attributes in the schemas to be matched. We discover three kinds of similarity that benefit schema matching: the similarity of the clauses themselves, the similarity of the frequency with which clauses occur in different SQL statements, and the similarity of statistics about the relationships among clauses. We combine the clauses related to these similarities into a graph and thereby transform the task of matching attributes into the problem of matching graphs. By matching the graphs, we obtain a set of attribute sequence pairs, each with a similarity score; each sequence pair represents a set of correspondences. Next, we exploit techniques from quadratic programming to decompose the sequence pairs into correspondences, that is, to obtain the similarity score of each correspondence. Finally, an efficient method is used to choose the best correspondence for each attribute from the candidate set. The experimental study shows that the proposed approach is effective and that its combination with other matchers performs well. |
ISSN: | 0926-8782 1573-7578 |
DOI: | 10.1007/s10619-019-07268-9 |
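
The second kind of similarity named in the summary (how often attributes occur in particular clauses) and the final selection step can be illustrated with a small sketch. This is not the paper's algorithm: it skips the graph matching and quadratic programming stages entirely and substitutes a plain cosine similarity with a per-attribute arg-max. The function names, the regex-based clause splitting, and the sample logs are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Clause types counted per attribute (an illustrative choice, not the paper's).
CLAUSES = ("SELECT", "WHERE", "GROUP BY", "ORDER BY")
# Naive clause splitter; assumes flat queries without subqueries or string literals.
_SPLIT = re.compile(r"\b(SELECT|FROM|WHERE|GROUP BY|ORDER BY)\b", re.IGNORECASE)


def clause_profiles(log, attributes):
    """Count, per attribute, how many clauses of each type mention it."""
    profiles = defaultdict(Counter)
    for stmt in log:
        parts = _SPLIT.split(stmt)  # [prefix, kw1, body1, kw2, body2, ...]
        for kw, body in zip(parts[1::2], parts[2::2]):
            kw = kw.upper()
            if kw not in CLAUSES:
                continue
            tokens = set(re.findall(r"\w+", body.lower()))
            for attr in attributes:
                if attr.lower() in tokens:
                    profiles[attr][kw] += 1
    return profiles


def cosine(a, b):
    """Cosine similarity of two clause-frequency profiles."""
    dot = sum(a[k] * b[k] for k in CLAUSES)
    na = math.sqrt(sum(a[k] ** 2 for k in CLAUSES))
    nb = math.sqrt(sum(b[k] ** 2 for k in CLAUSES))
    return dot / (na * nb) if na and nb else 0.0


def best_correspondences(src_profiles, tgt_profiles):
    """For each source attribute, keep the highest-scoring target attribute."""
    result = {}
    for s, sp in src_profiles.items():
        scored = [(cosine(sp, tp), t) for t, tp in tgt_profiles.items()]
        if scored:
            score, t = max(scored)
            result[s] = (t, score)
    return result


# Hypothetical query logs for two schemas to be matched.
log_a = [
    "SELECT name FROM emp WHERE salary > 100",
    "SELECT name, salary FROM emp ORDER BY salary",
    "SELECT name FROM emp WHERE salary < 50",
]
log_b = [
    "SELECT ename FROM staff WHERE pay > 100",
    "SELECT ename, pay FROM staff ORDER BY pay",
    "SELECT ename FROM staff WHERE pay < 10",
]

matches = best_correspondences(
    clause_profiles(log_a, ["name", "salary"]),
    clause_profiles(log_b, ["ename", "pay"]),
)
for src, (tgt, score) in matches.items():
    print(f"{src} -> {tgt} (score {score:.2f})")
```

In this toy input, `name` and `ename` appear only in SELECT clauses, while `salary` and `pay` appear mostly in WHERE clauses, so usage patterns alone pair them even though the names share nothing.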